* [PATCH v5 0/8] arm64, numa: Add numa support for arm64 platforms.
@ 2015-08-14 16:39 ` Ganapatrao Kulkarni
From: Ganapatrao Kulkarni @ 2015-08-14 16:39 UTC
  To: linux-arm-kernel, devicetree, Will.Deacon, catalin.marinas,
	grant.likely, leif.lindholm, rfranz, ard.biesheuvel, msalter,
	robh+dt, steve.capper, hanjun.guo, al.stone, arnd, pawel.moll,
	mark.rutland, ijc+devicetree, galak
  Cc: gpkulkarni

v5:
	- Created a base version of numa.c which creates a dummy NUMA node without
	  using DT on single-socket platforms, then added patches for DT support.
	- Incorporated review comments from Hanjun Guo.

v4:
Made changes as per Arnd's review comments.

v3:
Added changes to support NUMA on arm64-based platforms.
Tested these patches on Cavium's multinode (2-node topology) platform.
In this patchset, defined and implemented DT bindings for NUMA mapping
of cores and memory using the device node property arm,associativity.

v2:
Defined and implemented the NUMA map of memory and cores to nodes, and
the proximity distance matrix of nodes.

v1:
Initial patchset to support NUMA on arm64 platforms.

Note:
	1. This patchset has been tested for NUMA with DT on
	   ThunderX single-socket and dual-socket boards.
	2. NUMA DT booting needs the DT memory nodes, which are deleted by the
	   current EFI stub; hence, to try NUMA with DT, you need to rebase onto
	   Ard's patchset:
	   http://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-uefi-early-fdt-handling
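
For readers who want a feel for the dummy NUMA fallback mentioned under v5,
here is a minimal user-space sketch, not the kernel code from this series.
It assumes a single fallback node 0 that spans all memory and owns every CPU.
The names numa_layout, layout_add_memblk, layout_init_dummy and
layout_node_span, and the limits MAX_BLKS and MAX_CPUS, are hypothetical and
exist only for illustration; error handling is simplified compared with the
patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BLKS 8	/* hypothetical; the patch sizes this with NR_NODE_MEMBLKS */
#define MAX_CPUS 4	/* hypothetical; the patch sizes this with NR_CPUS */

struct memblk {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
	int nid;
};

struct numa_layout {
	int nr_blks;
	struct memblk blk[MAX_BLKS];
	int cpu_to_node[MAX_CPUS];
};

/* Record one [start, end) range for node nid; zero-length ranges are skipped. */
int layout_add_memblk(struct numa_layout *l, int nid,
		      uint64_t start, uint64_t end)
{
	if (start == end)
		return 0;
	if (start > end || nid < 0 || l->nr_blks >= MAX_BLKS)
		return -1;
	l->blk[l->nr_blks].start = start;
	l->blk[l->nr_blks].end = end;
	l->blk[l->nr_blks].nid = nid;
	l->nr_blks++;
	return 0;
}

/* Fallback layout: one node covering [0, mem_end), every CPU on node 0. */
void layout_init_dummy(struct numa_layout *l, uint64_t mem_end)
{
	int cpu;

	assert(mem_end > 0);
	l->nr_blks = 0;
	layout_add_memblk(l, 0, 0, mem_end);
	for (cpu = 0; cpu < MAX_CPUS; cpu++)
		l->cpu_to_node[cpu] = 0;
}

/*
 * Compute the span of a node as the lowest start and highest end over all
 * of its ranges. Returns 1 and fills *start and *end if the node has memory,
 * 0 otherwise.
 */
int layout_node_span(const struct numa_layout *l, int nid,
		     uint64_t *start, uint64_t *end)
{
	uint64_t s = UINT64_MAX, e = 0;
	int i;

	for (i = 0; i < l->nr_blks; i++) {
		if (l->blk[i].nid != nid)
			continue;
		if (l->blk[i].start < s)
			s = l->blk[i].start;
		if (l->blk[i].end > e)
			e = l->blk[i].end;
	}
	if (s >= e)
		return 0;
	*start = s;
	*end = e;
	return 1;
}
```

Calling layout_init_dummy() and then layout_node_span() for node 0 yields the
whole memory range, which is the situation a single-socket board ends up in
when no DT NUMA information is used.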


Ganapatrao Kulkarni (4):
  arm64, numa: adding numa support for arm64 platforms.
  Documentation: arm64/arm: dt bindings for numa.
  arm64, numa: adding numa support for arm64 platforms.
  arm64, dt, thunderx: Add initial dts for Cavium Thunder SoC in 2 Node
    topology.

 Documentation/devicetree/bindings/arm/numa.txt  | 212 +++++++
 arch/arm64/Kconfig                              |  32 +
 arch/arm64/boot/dts/cavium/Makefile             |   2 +-
 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts  |  78 +++
 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi | 790 ++++++++++++++++++++++++
 arch/arm64/include/asm/mmzone.h                 |  32 +
 arch/arm64/include/asm/numa.h                   |  49 ++
 arch/arm64/kernel/Makefile                      |   1 +
 arch/arm64/kernel/dt_numa.c                     | 316 ++++++++++
 arch/arm64/kernel/setup.c                       |   9 +
 arch/arm64/kernel/smp.c                         |   3 +
 arch/arm64/mm/Makefile                          |   1 +
 arch/arm64/mm/init.c                            |  34 +-
 arch/arm64/mm/numa.c                            | 563 +++++++++++++++++
 14 files changed, 2115 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
 create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
 create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
 create mode 100644 arch/arm64/include/asm/mmzone.h
 create mode 100644 arch/arm64/include/asm/numa.h
 create mode 100644 arch/arm64/kernel/dt_numa.c
 create mode 100644 arch/arm64/mm/numa.c

-- 
1.8.1.4

* [PATCH v5 1/4] arm64, numa: adding numa support for arm64 platforms.
  2015-08-14 16:39 ` Ganapatrao Kulkarni
@ 2015-08-14 16:39   ` Ganapatrao Kulkarni
From: Ganapatrao Kulkarni @ 2015-08-14 16:39 UTC
  To: linux-arm-kernel, devicetree, Will.Deacon, catalin.marinas,
	grant.likely, leif.lindholm, rfranz, ard.biesheuvel, msalter,
	robh+dt, steve.capper, hanjun.guo, al.stone, arnd, pawel.moll,
	mark.rutland, ijc+devicetree, galak
  Cc: gpkulkarni

Add NUMA support for arm64-based platforms.
By default, this patch adds a dummy NUMA node and
maps all memory and CPUs to node 0.
With this patch, NUMA can be simulated on single-node arm64 platforms.

Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
---
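As an illustration of how the distance table in numa.c is laid out, here is a
minimal user-space sketch; it is not the kernel code. It assumes a flat,
row-major cnt x cnt array of u8 entries with LOCAL_DISTANCE on the diagonal
and REMOTE_DISTANCE elsewhere. The struct distance_table and the
distance_table_* helpers are hypothetical names, malloc() stands in for the
memblock allocation, and the negative-id check in the lookup is an addition
that the patch's __node_distance() does not have.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LOCAL_DISTANCE	10	/* same values as include/linux/topology.h */
#define REMOTE_DISTANCE	20

/* Hypothetical stand-in for numa_distance and numa_distance_cnt. */
struct distance_table {
	int cnt;	/* number of rows and columns */
	uint8_t *dist;	/* cnt * cnt entries, row-major */
};

/* Allocate a cnt x cnt table and fill it with the default distances. */
int distance_table_init(struct distance_table *t, int cnt)
{
	int i, j;

	assert(cnt > 0);
	t->dist = malloc((size_t)cnt * (size_t)cnt);
	if (!t->dist)
		return -1;
	t->cnt = cnt;
	for (i = 0; i < cnt; i++)
		for (j = 0; j < cnt; j++)
			t->dist[i * cnt + j] =
				(i == j) ? LOCAL_DISTANCE : REMOTE_DISTANCE;
	return 0;
}

/* Ignore out-of-range node ids and distances that do not fit the table. */
void distance_table_set(struct distance_table *t,
			int from, int to, int distance)
{
	if (from < 0 || to < 0 || from >= t->cnt || to >= t->cnt)
		return;
	if ((uint8_t)distance != distance ||
	    (from == to && distance != LOCAL_DISTANCE))
		return;
	t->dist[from * t->cnt + to] = (uint8_t)distance;
}

/* Lookups outside the table fall back to the default distances. */
int distance_table_get(const struct distance_table *t, int from, int to)
{
	if (from < 0 || to < 0 || from >= t->cnt || to >= t->cnt)
		return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
	return t->dist[from * t->cnt + to];
}
```

The flat array keeps the table in one contiguous allocation, and each entry
fits in one byte because distances that do not survive a cast to u8 are
rejected when they are set.
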
 arch/arm64/Kconfig              |  26 ++
 arch/arm64/include/asm/mmzone.h |  32 +++
 arch/arm64/include/asm/numa.h   |  42 +++
 arch/arm64/kernel/setup.c       |   9 +
 arch/arm64/kernel/smp.c         |   2 +
 arch/arm64/mm/Makefile          |   1 +
 arch/arm64/mm/init.c            |  34 ++-
 arch/arm64/mm/numa.c            | 550 ++++++++++++++++++++++++++++++++++++++++
 8 files changed, 690 insertions(+), 6 deletions(-)
 create mode 100644 arch/arm64/include/asm/mmzone.h
 create mode 100644 arch/arm64/include/asm/numa.h
 create mode 100644 arch/arm64/mm/numa.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 43a0c26..fa37a5d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -72,6 +72,7 @@ config ARM64
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_RCU_TABLE_FREE
 	select HAVE_SYSCALL_TRACEPOINTS
+	select HAVE_MEMBLOCK_NODE_MAP if NUMA
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
 	select MODULES_USE_ELF_RELA
@@ -559,6 +560,31 @@ config HOTPLUG_CPU
 	  Say Y here to experiment with turning CPUs off and on.  CPUs
 	  can be controlled through /sys/devices/system/cpu.
 
+# Common NUMA Features
+config NUMA
+	bool "Numa Memory Allocation and Scheduler Support"
+	depends on SMP
+	help
+	  Enable NUMA (Non Uniform Memory Access) support.
+
+	  The kernel will try to allocate memory used by a CPU on the
+	  local memory controller of the CPU and add some more
+	  NUMA awareness to the kernel.
+
+config NODES_SHIFT
+	int "Maximum NUMA Nodes (as a power of 2)"
+	range 1 10
+	default "2"
+	depends on NEED_MULTIPLE_NODES
+	help
+	  Specify the maximum number of NUMA Nodes available on the target
+	  system.  Increases memory reserved to accommodate various tables.
+
+config USE_PERCPU_NUMA_NODE_ID
+	def_bool y
+	depends on NUMA
+
+
 source kernel/Kconfig.preempt
 
 config UP_LATE_INIT
diff --git a/arch/arm64/include/asm/mmzone.h b/arch/arm64/include/asm/mmzone.h
new file mode 100644
index 0000000..d27ee66
--- /dev/null
+++ b/arch/arm64/include/asm/mmzone.h
@@ -0,0 +1,32 @@
+#ifndef __ASM_ARM64_MMZONE_H_
+#define __ASM_ARM64_MMZONE_H_
+
+#ifdef CONFIG_NUMA
+
+#include <linux/mmdebug.h>
+#include <asm/smp.h>
+#include <linux/types.h>
+#include <asm/numa.h>
+
+extern struct pglist_data *node_data[];
+
+#define NODE_DATA(nid)		(node_data[nid])
+
+
+struct numa_memblk {
+	u64			start;
+	u64			end;
+	int			nid;
+};
+
+struct numa_meminfo {
+	int			nr_blks;
+	struct numa_memblk	blk[NR_NODE_MEMBLKS];
+};
+
+void __init numa_remove_memblk_from(int idx, struct numa_meminfo *mi);
+int __init numa_cleanup_meminfo(struct numa_meminfo *mi);
+void __init numa_reset_distance(void);
+
+#endif /* CONFIG_NUMA */
+#endif /* __ASM_ARM64_MMZONE_H_ */
diff --git a/arch/arm64/include/asm/numa.h b/arch/arm64/include/asm/numa.h
new file mode 100644
index 0000000..59b834e
--- /dev/null
+++ b/arch/arm64/include/asm/numa.h
@@ -0,0 +1,42 @@
+#ifndef _ASM_NUMA_H
+#define _ASM_NUMA_H
+
+#include <linux/nodemask.h>
+#include <asm/topology.h>
+
+#ifdef CONFIG_NUMA
+
+#define NR_NODE_MEMBLKS		(MAX_NUMNODES * 2)
+#define ZONE_ALIGN (1UL << (MAX_ORDER + PAGE_SHIFT))
+
+/* currently, arm64 implements flat NUMA topology */
+#define parent_node(node)	(node)
+
+/* dummy definitions for pci functions */
+#define pcibus_to_node(node)	0
+#define cpumask_of_pcibus(bus)	0
+
+struct __node_cpu_hwid {
+	u32 node_id;    /* logical node containing this CPU */
+	u64 cpu_hwid;   /* MPIDR for this CPU */
+};
+
+extern struct __node_cpu_hwid node_cpu_hwid[NR_CPUS];
+extern nodemask_t numa_nodes_parsed __initdata;
+
+const struct cpumask *cpumask_of_node(int node);
+/* Mappings between node number and cpus on that node. */
+extern cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
+
+void __init arm64_numa_init(void);
+int __init numa_add_memblk(u32 nodeid, u64 start, u64 end);
+void numa_store_cpu_info(int cpu);
+void __init build_cpu_to_node_map(void);
+void __init numa_set_distance(int from, int to, int distance);
+#else	/* CONFIG_NUMA */
+static inline void numa_store_cpu_info(int cpu)		{ }
+static inline void arm64_numa_init(void)		{ }
+static inline void build_cpu_to_node_map(void) { }
+static inline void numa_set_distance(int from, int to, int distance) { }
+#endif	/* CONFIG_NUMA */
+#endif	/* _ASM_NUMA_H */
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 4d4d7ce..6e101eb 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -65,6 +65,7 @@
 #include <asm/efi.h>
 #include <asm/virt.h>
 #include <asm/xen/hypervisor.h>
+#include <asm/numa.h>
 
 unsigned long elf_hwcap __read_mostly;
 EXPORT_SYMBOL_GPL(elf_hwcap);
@@ -439,6 +440,9 @@ static int __init topology_init(void)
 {
 	int i;
 
+	for_each_online_node(i)
+		register_one_node(i);
+
 	for_each_possible_cpu(i) {
 		struct cpu *cpu = &per_cpu(cpu_data.cpu, i);
 		cpu->hotpluggable = 1;
@@ -511,7 +515,12 @@ static int c_show(struct seq_file *m, void *v)
 		 * "processor".  Give glibc what it expects.
 		 */
 #ifdef CONFIG_SMP
+	if (IS_ENABLED(CONFIG_NUMA)) {
+		seq_printf(m, "processor\t: %d", i);
+		seq_printf(m, " [nid: %d]\n", cpu_to_node(i));
+	} else {
 		seq_printf(m, "processor\t: %d\n", i);
+	}
 #endif
 
 		/*
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 50fb469..ae3e02c 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -52,6 +52,7 @@
 #include <asm/sections.h>
 #include <asm/tlbflush.h>
 #include <asm/ptrace.h>
+#include <asm/numa.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/ipi.h>
@@ -124,6 +125,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 static void smp_store_cpu_info(unsigned int cpuid)
 {
 	store_cpu_topology(cpuid);
+	numa_store_cpu_info(cpuid);
 }
 
 /*
diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
index 773d37a..bb92d41 100644
--- a/arch/arm64/mm/Makefile
+++ b/arch/arm64/mm/Makefile
@@ -4,3 +4,4 @@ obj-y				:= dma-mapping.o extable.o fault.o init.o \
 				   context.o proc.o pageattr.o
 obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
 obj-$(CONFIG_ARM64_PTDUMP)	+= dump.o
+obj-$(CONFIG_NUMA)		+= numa.o
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 54da32e..cab384b 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -42,6 +42,7 @@
 #include <asm/sizes.h>
 #include <asm/tlb.h>
 #include <asm/alternative.h>
+#include <asm/numa.h>
 
 #include "mm.h"
 
@@ -77,6 +78,20 @@ static phys_addr_t max_zone_dma_phys(void)
 	return min(offset + (1ULL << 32), memblock_end_of_DRAM());
 }
 
+#ifdef CONFIG_NUMA
+static void __init zone_sizes_init(unsigned long min, unsigned long max)
+{
+	unsigned long max_zone_pfns[MAX_NR_ZONES];
+
+	memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
+	if (IS_ENABLED(CONFIG_ZONE_DMA))
+		max_zone_pfns[ZONE_DMA] = PFN_DOWN(max_zone_dma_phys());
+	max_zone_pfns[ZONE_NORMAL] = max;
+
+	free_area_init_nodes(max_zone_pfns);
+}
+
+#else
 static void __init zone_sizes_init(unsigned long min, unsigned long max)
 {
 	struct memblock_region *reg;
@@ -115,6 +130,7 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 
 	free_area_init_node(0, zone_size, min, zhole_size);
 }
+#endif /* CONFIG_NUMA */
 
 #ifdef CONFIG_HAVE_ARCH_PFN_VALID
 int pfn_valid(unsigned long pfn)
@@ -132,10 +148,15 @@ static void arm64_memory_present(void)
 static void arm64_memory_present(void)
 {
 	struct memblock_region *reg;
+	int nid = 0;
 
-	for_each_memblock(memory, reg)
-		memory_present(0, memblock_region_memory_base_pfn(reg),
-			       memblock_region_memory_end_pfn(reg));
+	for_each_memblock(memory, reg) {
+#ifdef CONFIG_NUMA
+		nid = reg->nid;
+#endif
+		memory_present(nid, memblock_region_memory_base_pfn(reg),
+				memblock_region_memory_end_pfn(reg));
+	}
 }
 #endif
 
@@ -200,6 +221,10 @@ void __init bootmem_init(void)
 	min = PFN_UP(memblock_start_of_DRAM());
 	max = PFN_DOWN(memblock_end_of_DRAM());
 
+	high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
+	max_pfn = max_low_pfn = max;
+
+	arm64_numa_init();
 	/*
 	 * Sparsemem tries to allocate bootmem in memory_present(), so must be
 	 * done after the fixed reservations.
@@ -208,9 +233,6 @@ void __init bootmem_init(void)
 
 	sparse_init();
 	zone_sizes_init(min, max);
-
-	high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
-	max_pfn = max_low_pfn = max;
 }
 
 #ifndef CONFIG_SPARSEMEM_VMEMMAP
diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
new file mode 100644
index 0000000..2be83de
--- /dev/null
+++ b/arch/arm64/mm/numa.c
@@ -0,0 +1,550 @@
+/*
+ * NUMA support, based on the x86 implementation.
+ *
+ * Copyright (C) 2015 Cavium Inc.
+ * Author: Ganapatrao Kulkarni <gkulkarni@cavium.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/string.h>
+#include <linux/init.h>
+#include <linux/bootmem.h>
+#include <linux/memblock.h>
+#include <linux/ctype.h>
+#include <linux/module.h>
+#include <linux/nodemask.h>
+#include <linux/sched.h>
+#include <linux/topology.h>
+#include <linux/mmzone.h>
+
+#include <asm/smp_plat.h>
+
+int __initdata numa_off;
+nodemask_t numa_nodes_parsed __initdata;
+static int numa_distance_cnt;
+static u8 *numa_distance;
+static u8 dummy_numa_enabled;
+
+struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
+EXPORT_SYMBOL(node_data);
+
+struct __node_cpu_hwid node_cpu_hwid[NR_CPUS];
+static struct numa_meminfo numa_meminfo;
+
+static __init int numa_setup(char *opt)
+{
+	if (!opt)
+		return -EINVAL;
+	if (!strncmp(opt, "off", 3)) {
+		pr_info("%s\n", "NUMA turned off");
+		numa_off = 1;
+	}
+	return 0;
+}
+early_param("numa", numa_setup);
+
+cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
+EXPORT_SYMBOL(node_to_cpumask_map);
+
+int cpu_to_node_map[NR_CPUS];
+EXPORT_SYMBOL(cpu_to_node_map);
+
+/*
+ * Returns a pointer to the bitmask of CPUs on Node 'node'.
+ */
+const struct cpumask *cpumask_of_node(int node)
+{
+	if (node >= nr_node_ids) {
+		pr_warn("cpumask_of_node(%d): node >= nr_node_ids(%d)\n",
+			node, nr_node_ids);
+		dump_stack();
+		return cpu_none_mask;
+	}
+	if (node_to_cpumask_map[node] == NULL) {
+		pr_warn("cpumask_of_node(%d): no node_to_cpumask_map!\n",
+			node);
+		dump_stack();
+		return cpu_online_mask;
+	}
+	return node_to_cpumask_map[node];
+}
+EXPORT_SYMBOL(cpumask_of_node);
+
+void numa_clear_node(int cpu)
+{
+	set_cpu_numa_node(cpu, NUMA_NO_NODE);
+}
+
+void map_cpu_to_node(int cpu, int nid)
+{
+	if (nid < 0) { /* just initialize by zero */
+		cpu_to_node_map[cpu] = 0;
+		return;
+	}
+
+	cpu_to_node_map[cpu] = nid;
+	cpumask_set_cpu(cpu, node_to_cpumask_map[nid]);
+	set_numa_node(nid);
+}
+
+/**
+ * build_cpu_to_node_map - setup cpu to node and node to cpumask arrays
+ *
+ * Build cpu to node mapping and initialize the per node cpu masks using
+ * info from the node_cpu_hwid array filled in from ACPI or DT.
+ */
+void __init build_cpu_to_node_map(void)
+{
+	int cpu, i, node;
+
+	for (node = 0; node < MAX_NUMNODES; node++)
+		cpumask_clear(node_to_cpumask_map[node]);
+
+	for_each_possible_cpu(cpu) {
+		node = NUMA_NO_NODE;
+		for_each_possible_cpu(i) {
+			if (cpu_logical_map(cpu) == node_cpu_hwid[i].cpu_hwid) {
+				node = node_cpu_hwid[i].node_id;
+				break;
+			}
+		}
+		map_cpu_to_node(cpu, node);
+	}
+}
+/*
+ * Allocate node_to_cpumask_map based on number of available nodes
+ * Requires node_possible_map to be valid.
+ *
+ * Note: cpumask_of_node() is not valid until after this is done.
+ * (Use CONFIG_DEBUG_PER_CPU_MAPS to check this.)
+ */
+void __init setup_node_to_cpumask_map(void)
+{
+	unsigned int node;
+
+	/* setup nr_node_ids if not done yet */
+	if (nr_node_ids == MAX_NUMNODES)
+		setup_nr_node_ids();
+
+	/* allocate the map */
+	for (node = 0; node < nr_node_ids; node++)
+		alloc_bootmem_cpumask_var(&node_to_cpumask_map[node]);
+
+	/* cpumask_of_node() will now work */
+	pr_debug("Node to cpumask map for %d nodes\n", nr_node_ids);
+}
+
+/*
+ *  Set the cpu to node and mem mapping
+ */
+void numa_store_cpu_info(int cpu)
+{
+	if (dummy_numa_enabled) {
+		/* set to default */
+		node_cpu_hwid[cpu].node_id  =  0;
+		node_cpu_hwid[cpu].cpu_hwid = cpu_logical_map(cpu);
+	}
+	map_cpu_to_node(cpu, node_cpu_hwid[cpu].node_id);
+}
+
+/**
+ * numa_add_memblk_to - Add one numa_memblk to a numa_meminfo
+ */
+
+static int __init numa_add_memblk_to(int nid, u64 start, u64 end,
+				     struct numa_meminfo *mi)
+{
+	/* ignore zero length blks */
+	if (start == end)
+		return 0;
+
+	/* whine about and ignore invalid blks */
+	if (start > end || nid < 0 || nid >= MAX_NUMNODES) {
+		pr_warn("NUMA: Warning: invalid memblk node %d [mem %#010Lx-%#010Lx]\n",
+				nid, start, end - 1);
+		return 0;
+	}
+
+	if (mi->nr_blks >= NR_NODE_MEMBLKS) {
+		pr_err("NUMA: too many memblk ranges\n");
+		return -EINVAL;
+	}
+
+	pr_info("NUMA: Adding memblock %d [0x%llx - 0x%llx] on node %d\n",
+			mi->nr_blks, start, end, nid);
+	mi->blk[mi->nr_blks].start = start;
+	mi->blk[mi->nr_blks].end = end;
+	mi->blk[mi->nr_blks].nid = nid;
+	mi->nr_blks++;
+	return 0;
+}
+
+/**
+ * numa_add_memblk - Add one numa_memblk to numa_meminfo
+ * @nid: NUMA node ID of the new memblk
+ * @base: Start address of the new memblk
+ * @end: End address of the new memblk
+ *
+ * Add a new memblk to the default numa_meminfo.
+ *
+ * RETURNS:
+ * 0 on success, -errno on failure.
+ */
+#define MAX_PHYS_ADDR	((phys_addr_t)~0)
+
+int __init numa_add_memblk(u32 nid, u64 base, u64 end)
+{
+	const u64 phys_offset = __pa(PAGE_OFFSET);
+
+	base &= PAGE_MASK;
+	end &= PAGE_MASK;
+
+	if (base > MAX_PHYS_ADDR) {
+		pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
+				base, base + end);
+		return -ENOMEM;
+	}
+
+	if (base + end > MAX_PHYS_ADDR) {
+		pr_info("NUMA: Ignoring memory range 0x%lx - 0x%llx\n",
+				ULONG_MAX, base + end);
+		end = MAX_PHYS_ADDR - base;
+	}
+
+	if (base + end < phys_offset) {
+		pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
+			   base, base + end);
+		return -ENOMEM;
+	}
+	if (base < phys_offset) {
+		pr_info("NUMA: Ignoring memory range 0x%llx - 0x%llx\n",
+			   base, phys_offset);
+		end -= phys_offset - base;
+		base = phys_offset;
+	}
+
+	return numa_add_memblk_to(nid, base, base + end, &numa_meminfo);
+}
+EXPORT_SYMBOL(numa_add_memblk);
+
+/* Initialize NODE_DATA for a node on the local memory */
+static void __init setup_node_data(int nid, u64 start, u64 end)
+{
+	const size_t nd_size = roundup(sizeof(pg_data_t), PAGE_SIZE);
+	u64 nd_pa;
+	void *nd;
+	int tnid;
+
+	start = roundup(start, ZONE_ALIGN);
+
+	pr_info("Initmem setup node %d [mem %#010Lx-%#010Lx]\n",
+	       nid, start, end - 1);
+
+	/*
+	 * Allocate node data.  Try node-local memory and then any node.
+	 */
+	nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
+	if (!nd_pa) {
+		nd_pa = __memblock_alloc_base(nd_size, SMP_CACHE_BYTES,
+					      MEMBLOCK_ALLOC_ACCESSIBLE);
+		if (!nd_pa) {
+			pr_err("Cannot find %zu bytes in node %d\n",
+			       nd_size, nid);
+			return;
+		}
+	}
+	nd = __va(nd_pa);
+
+	/* report and initialize */
+	pr_info("  NODE_DATA [mem %#010Lx-%#010Lx]\n",
+	       nd_pa, nd_pa + nd_size - 1);
+	tnid = early_pfn_to_nid(nd_pa >> PAGE_SHIFT);
+	if (tnid != nid)
+		pr_info("    NODE_DATA(%d) on node %d\n", nid, tnid);
+
+	node_data[nid] = nd;
+	memset(NODE_DATA(nid), 0, sizeof(pg_data_t));
+	NODE_DATA(nid)->node_id = nid;
+	NODE_DATA(nid)->node_start_pfn = start >> PAGE_SHIFT;
+	NODE_DATA(nid)->node_spanned_pages = (end - start) >> PAGE_SHIFT;
+
+	node_set_online(nid);
+}
+
+/*
+ * Set nodes, which have memory in @mi, in *@nodemask.
+ */
+static void __init numa_nodemask_from_meminfo(nodemask_t *nodemask,
+					      const struct numa_meminfo *mi)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(mi->blk); i++)
+		if (mi->blk[i].start != mi->blk[i].end &&
+		    mi->blk[i].nid != NUMA_NO_NODE)
+			node_set(mi->blk[i].nid, *nodemask);
+}
+
+/*
+ * Sanity check to catch more bad NUMA configurations (they are amazingly
+ * common).  Make sure the nodes cover all memory.
+ */
+static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
+{
+	u64 numaram, totalram;
+	int i;
+
+	numaram = 0;
+	for (i = 0; i < mi->nr_blks; i++) {
+		u64 s = mi->blk[i].start >> PAGE_SHIFT;
+		u64 e = mi->blk[i].end >> PAGE_SHIFT;
+
+		numaram += e - s;
+		numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
+		if ((s64)numaram < 0)
+			numaram = 0;
+	}
+
+	totalram = max_pfn - absent_pages_in_range(0, max_pfn);
+
+	/* We seem to lose 3 pages somewhere. Allow 1M of slack. */
+	if ((s64)(totalram - numaram) >= (1 << (20 - PAGE_SHIFT))) {
+		pr_err("NUMA: nodes only cover %lluMB of your %lluMB Total RAM. Not used.\n",
+		       (numaram << PAGE_SHIFT) >> 20,
+		       (totalram << PAGE_SHIFT) >> 20);
+		return false;
+	}
+	return true;
+}
+
+/**
+ * numa_reset_distance - Reset NUMA distance table
+ *
+ * The current table is freed.  The next numa_set_distance() call will
+ * create a new one.
+ */
+void __init numa_reset_distance(void)
+{
+	size_t size = numa_distance_cnt * numa_distance_cnt *
+		sizeof(numa_distance[0]);
+
+	/* numa_distance could be 1LU marking allocation failure, test cnt */
+	if (numa_distance_cnt)
+		memblock_free(__pa(numa_distance), size);
+	numa_distance_cnt = 0;
+	numa_distance = NULL;	/* enable table creation */
+}
+
+static int __init numa_alloc_distance(void)
+{
+	nodemask_t nodes_parsed;
+	size_t size;
+	int i, j, cnt = 0;
+	u64 phys;
+
+	/* size the new table and allocate it */
+	nodes_parsed = numa_nodes_parsed;
+	numa_nodemask_from_meminfo(&nodes_parsed, &numa_meminfo);
+
+	for_each_node_mask(i, nodes_parsed)
+		cnt = i;
+	cnt++;
+	size = cnt * cnt * sizeof(numa_distance[0]);
+
+	phys = memblock_find_in_range(0, PFN_PHYS(max_pfn),
+				      size, PAGE_SIZE);
+	if (!phys) {
+		pr_warn("NUMA: Warning: can't allocate distance table!\n");
+		/* don't retry until explicitly reset */
+		numa_distance = (void *)1LU;
+		return -ENOMEM;
+	}
+	memblock_reserve(phys, size);
+
+	numa_distance = __va(phys);
+	numa_distance_cnt = cnt;
+
+	/* fill with the default distances */
+	for (i = 0; i < cnt; i++)
+		for (j = 0; j < cnt; j++)
+			numa_distance[i * cnt + j] = i == j ?
+				LOCAL_DISTANCE : REMOTE_DISTANCE;
+	pr_debug("NUMA: Initialized distance table, cnt=%d\n", cnt);
+
+	return 0;
+}
+
+/**
+ * numa_set_distance - Set NUMA distance from one NUMA to another
+ * @from: the 'from' node to set distance
+ * @to: the 'to'  node to set distance
+ * @distance: NUMA distance
+ *
+ * Set the distance from node @from to @to to @distance.  If distance table
+ * doesn't exist, one which is large enough to accommodate all the currently
+ * known nodes will be created.
+ *
+ * If such table cannot be allocated, a warning is printed and further
+ * calls are ignored until the distance table is reset with
+ * numa_reset_distance().
+ *
+ * If @from or @to is higher than the highest known node or lower than zero
+ * at the time of table creation or @distance doesn't make sense, the call
+ * is ignored.
+ * This is to allow simplification of specific NUMA config implementations.
+ */
+void __init numa_set_distance(int from, int to, int distance)
+{
+	if (!numa_distance && numa_alloc_distance() < 0)
+		return;
+
+	if (from >= numa_distance_cnt || to >= numa_distance_cnt ||
+			from < 0 || to < 0) {
+		pr_warn_once("NUMA: Warning: node ids are out of bound, from=%d to=%d distance=%d\n",
+			    from, to, distance);
+		return;
+	}
+
+	if ((u8)distance != distance ||
+	    (from == to && distance != LOCAL_DISTANCE)) {
+		pr_warn_once("NUMA: Warning: invalid distance parameter, from=%d to=%d distance=%d\n",
+			     from, to, distance);
+		return;
+	}
+
+	numa_distance[from * numa_distance_cnt + to] = distance;
+}
+EXPORT_SYMBOL(numa_set_distance);
+
+int __node_distance(int from, int to)
+{
+	if (from >= numa_distance_cnt || to >= numa_distance_cnt)
+		return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
+	return numa_distance[from * numa_distance_cnt + to];
+}
+EXPORT_SYMBOL(__node_distance);
+
+static int __init numa_register_memblks(struct numa_meminfo *mi)
+{
+	unsigned long uninitialized_var(pfn_align);
+	int i, nid;
+
+	/* Account for nodes with cpus and no memory */
+	node_possible_map = numa_nodes_parsed;
+	numa_nodemask_from_meminfo(&node_possible_map, mi);
+	if (WARN_ON(nodes_empty(node_possible_map)))
+		return -EINVAL;
+
+	for (i = 0; i < mi->nr_blks; i++) {
+		struct numa_memblk *mb = &mi->blk[i];
+
+		memblock_set_node(mb->start, mb->end - mb->start,
+				  &memblock.memory, mb->nid);
+	}
+
+	/*
+	 * If sections array is gonna be used for pfn -> nid mapping, check
+	 * whether its granularity is fine enough.
+	 */
+#ifdef NODE_NOT_IN_PAGE_FLAGS
+	pfn_align = node_map_pfn_alignment();
+	if (pfn_align && pfn_align < PAGES_PER_SECTION) {
+		pr_warn("Node alignment %lluMB < min %lluMB, rejecting NUMA config\n",
+		       PFN_PHYS(pfn_align) >> 20,
+		       PFN_PHYS(PAGES_PER_SECTION) >> 20);
+		return -EINVAL;
+	}
+#endif
+	if (!numa_meminfo_cover_memory(mi))
+		return -EINVAL;
+
+	/* Finally register nodes. */
+	for_each_node_mask(nid, node_possible_map) {
+		u64 start = PFN_PHYS(max_pfn);
+		u64 end = 0;
+
+		for (i = 0; i < mi->nr_blks; i++) {
+			if (nid != mi->blk[i].nid)
+				continue;
+			start = min(mi->blk[i].start, start);
+			end = max(mi->blk[i].end, end);
+		}
+
+		if (start < end)
+			setup_node_data(nid, start, end);
+	}
+
+	/* Dump memblock with node info and return. */
+	memblock_dump_all();
+	return 0;
+}
+
+static int __init numa_init(int (*init_func)(void))
+{
+	int ret, i;
+
+	nodes_clear(numa_nodes_parsed);
+	nodes_clear(node_possible_map);
+	nodes_clear(node_online_map);
+
+	ret = init_func();
+	if (ret < 0)
+		return ret;
+
+	ret = numa_register_memblks(&numa_meminfo);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < nr_cpu_ids; i++)
+		numa_clear_node(i);
+
+	setup_node_to_cpumask_map();
+	build_cpu_to_node_map();
+	return 0;
+}
+
+/**
+ * dummy_numa_init - Fallback dummy NUMA init
+ *
+ * Used if there's no underlying NUMA architecture, NUMA initialization
+ * fails, or NUMA is disabled on the command line.
+ *
+ * Must online at least one node and add memory blocks that cover all
+ * allowed memory.  This function must not fail.
+ */
+static int __init dummy_numa_init(void)
+{
+	pr_info("%s\n", "No NUMA configuration found");
+	pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n",
+	       0LLU, PFN_PHYS(max_pfn) - 1);
+	node_set(0, numa_nodes_parsed);
+	numa_add_memblk(0, 0, PFN_PHYS(max_pfn));
+	dummy_numa_enabled = 1;
+
+	return 0;
+}
+
+/**
+ * arm64_numa_init - Initialize NUMA
+ *
+ * Try each configured NUMA initialization method until one succeeds.  The
+ * last fallback is dummy single node config encompassing whole memory and
+ * never fails.
+ */
+void __init arm64_numa_init(void)
+{
+	numa_init(dummy_numa_init);
+}
-- 
1.8.1.4


* [PATCH v5 1/4] arm64, numa: adding numa support for arm64 platforms.
@ 2015-08-14 16:39   ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-08-14 16:39 UTC (permalink / raw)
  To: linux-arm-kernel

Adding numa support for arm64 based platforms.
This patch adds by default the dummy numa node and
maps all memory and cpus to node 0.
using this patch, numa can be simulated on single node arm64 platforms.

Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
---
 arch/arm64/Kconfig              |  26 ++
 arch/arm64/include/asm/mmzone.h |  32 +++
 arch/arm64/include/asm/numa.h   |  42 +++
 arch/arm64/kernel/setup.c       |   9 +
 arch/arm64/kernel/smp.c         |   2 +
 arch/arm64/mm/Makefile          |   1 +
 arch/arm64/mm/init.c            |  34 ++-
 arch/arm64/mm/numa.c            | 550 ++++++++++++++++++++++++++++++++++++++++
 8 files changed, 690 insertions(+), 6 deletions(-)
 create mode 100644 arch/arm64/include/asm/mmzone.h
 create mode 100644 arch/arm64/include/asm/numa.h
 create mode 100644 arch/arm64/mm/numa.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 43a0c26..fa37a5d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -72,6 +72,7 @@ config ARM64
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_RCU_TABLE_FREE
 	select HAVE_SYSCALL_TRACEPOINTS
+	select HAVE_MEMBLOCK_NODE_MAP if NUMA
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
 	select MODULES_USE_ELF_RELA
@@ -559,6 +560,31 @@ config HOTPLUG_CPU
 	  Say Y here to experiment with turning CPUs off and on.  CPUs
 	  can be controlled through /sys/devices/system/cpu.
 
+# Common NUMA Features
+config NUMA
+	bool "NUMA Memory Allocation and Scheduler Support"
+	depends on SMP
+	help
+	  Enable NUMA (Non Uniform Memory Access) support.
+
+	  The kernel will try to allocate memory used by a CPU on the
+	  local memory controller of the CPU and add some more
+	  NUMA awareness to the kernel.
+
+config NODES_SHIFT
+	int "Maximum NUMA Nodes (as a power of 2)"
+	range 1 10
+	default "2"
+	depends on NEED_MULTIPLE_NODES
+	help
+	  Specify the maximum number of NUMA Nodes available on the target
+	  system.  Increases memory reserved to accommodate various tables.
+
+config USE_PERCPU_NUMA_NODE_ID
+	def_bool y
+	depends on NUMA
+
+
 source kernel/Kconfig.preempt
 
 config UP_LATE_INIT
diff --git a/arch/arm64/include/asm/mmzone.h b/arch/arm64/include/asm/mmzone.h
new file mode 100644
index 0000000..d27ee66
--- /dev/null
+++ b/arch/arm64/include/asm/mmzone.h
@@ -0,0 +1,32 @@
+#ifndef __ASM_ARM64_MMZONE_H_
+#define __ASM_ARM64_MMZONE_H_
+
+#ifdef CONFIG_NUMA
+
+#include <linux/mmdebug.h>
+#include <asm/smp.h>
+#include <linux/types.h>
+#include <asm/numa.h>
+
+extern struct pglist_data *node_data[];
+
+#define NODE_DATA(nid)		(node_data[nid])
+
+
+struct numa_memblk {
+	u64			start;
+	u64			end;
+	int			nid;
+};
+
+struct numa_meminfo {
+	int			nr_blks;
+	struct numa_memblk	blk[NR_NODE_MEMBLKS];
+};
+
+void __init numa_remove_memblk_from(int idx, struct numa_meminfo *mi);
+int __init numa_cleanup_meminfo(struct numa_meminfo *mi);
+void __init numa_reset_distance(void);
+
+#endif /* CONFIG_NUMA */
+#endif /* __ASM_ARM64_MMZONE_H_ */
diff --git a/arch/arm64/include/asm/numa.h b/arch/arm64/include/asm/numa.h
new file mode 100644
index 0000000..59b834e
--- /dev/null
+++ b/arch/arm64/include/asm/numa.h
@@ -0,0 +1,42 @@
+#ifndef _ASM_NUMA_H
+#define _ASM_NUMA_H
+
+#include <linux/nodemask.h>
+#include <asm/topology.h>
+
+#ifdef CONFIG_NUMA
+
+#define NR_NODE_MEMBLKS		(MAX_NUMNODES * 2)
+#define ZONE_ALIGN (1UL << (MAX_ORDER + PAGE_SHIFT))
+
+/* currently, arm64 implements flat NUMA topology */
+#define parent_node(node)	(node)
+
+/* dummy definitions for pci functions */
+#define pcibus_to_node(node)	0
+#define cpumask_of_pcibus(bus)	0
+
+struct __node_cpu_hwid {
+	u32 node_id;    /* logical node containing this CPU */
+	u64 cpu_hwid;   /* MPIDR for this CPU */
+};
+
+extern struct __node_cpu_hwid node_cpu_hwid[NR_CPUS];
+extern nodemask_t numa_nodes_parsed __initdata;
+
+const struct cpumask *cpumask_of_node(int node);
+/* Mappings between node number and cpus on that node. */
+extern cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
+
+void __init arm64_numa_init(void);
+int __init numa_add_memblk(u32 nodeid, u64 start, u64 end);
+void numa_store_cpu_info(int cpu);
+void __init build_cpu_to_node_map(void);
+void __init numa_set_distance(int from, int to, int distance);
+#else	/* CONFIG_NUMA */
+static inline void numa_store_cpu_info(int cpu)		{ }
+static inline void arm64_numa_init(void)		{ }
+static inline void build_cpu_to_node_map(void) { }
+static inline void numa_set_distance(int from, int to, int distance) { }
+#endif	/* CONFIG_NUMA */
+#endif	/* _ASM_NUMA_H */
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 4d4d7ce..6e101eb 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -65,6 +65,7 @@
 #include <asm/efi.h>
 #include <asm/virt.h>
 #include <asm/xen/hypervisor.h>
+#include <asm/numa.h>
 
 unsigned long elf_hwcap __read_mostly;
 EXPORT_SYMBOL_GPL(elf_hwcap);
@@ -439,6 +440,9 @@ static int __init topology_init(void)
 {
 	int i;
 
+	for_each_online_node(i)
+		register_one_node(i);
+
 	for_each_possible_cpu(i) {
 		struct cpu *cpu = &per_cpu(cpu_data.cpu, i);
 		cpu->hotpluggable = 1;
@@ -511,7 +515,12 @@ static int c_show(struct seq_file *m, void *v)
 		 * "processor".  Give glibc what it expects.
 		 */
 #ifdef CONFIG_SMP
+	if (IS_ENABLED(CONFIG_NUMA)) {
+		seq_printf(m, "processor\t: %d", i);
+		seq_printf(m, " [nid: %d]\n", cpu_to_node(i));
+	} else {
 		seq_printf(m, "processor\t: %d\n", i);
+	}
 #endif
 
 		/*
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 50fb469..ae3e02c 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -52,6 +52,7 @@
 #include <asm/sections.h>
 #include <asm/tlbflush.h>
 #include <asm/ptrace.h>
+#include <asm/numa.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/ipi.h>
@@ -124,6 +125,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 static void smp_store_cpu_info(unsigned int cpuid)
 {
 	store_cpu_topology(cpuid);
+	numa_store_cpu_info(cpuid);
 }
 
 /*
diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
index 773d37a..bb92d41 100644
--- a/arch/arm64/mm/Makefile
+++ b/arch/arm64/mm/Makefile
@@ -4,3 +4,4 @@ obj-y				:= dma-mapping.o extable.o fault.o init.o \
 				   context.o proc.o pageattr.o
 obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
 obj-$(CONFIG_ARM64_PTDUMP)	+= dump.o
+obj-$(CONFIG_NUMA)		+= numa.o
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 54da32e..cab384b 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -42,6 +42,7 @@
 #include <asm/sizes.h>
 #include <asm/tlb.h>
 #include <asm/alternative.h>
+#include <asm/numa.h>
 
 #include "mm.h"
 
@@ -77,6 +78,20 @@ static phys_addr_t max_zone_dma_phys(void)
 	return min(offset + (1ULL << 32), memblock_end_of_DRAM());
 }
 
+#ifdef CONFIG_NUMA
+static void __init zone_sizes_init(unsigned long min, unsigned long max)
+{
+	unsigned long max_zone_pfns[MAX_NR_ZONES];
+
+	memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
+	if (IS_ENABLED(CONFIG_ZONE_DMA))
+		max_zone_pfns[ZONE_DMA] = PFN_DOWN(max_zone_dma_phys());
+	max_zone_pfns[ZONE_NORMAL] = max;
+
+	free_area_init_nodes(max_zone_pfns);
+}
+
+#else
 static void __init zone_sizes_init(unsigned long min, unsigned long max)
 {
 	struct memblock_region *reg;
@@ -115,6 +130,7 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 
 	free_area_init_node(0, zone_size, min, zhole_size);
 }
+#endif /* CONFIG_NUMA */
 
 #ifdef CONFIG_HAVE_ARCH_PFN_VALID
 int pfn_valid(unsigned long pfn)
@@ -132,10 +148,15 @@ static void arm64_memory_present(void)
 static void arm64_memory_present(void)
 {
 	struct memblock_region *reg;
+	int nid = 0;
 
-	for_each_memblock(memory, reg)
-		memory_present(0, memblock_region_memory_base_pfn(reg),
-			       memblock_region_memory_end_pfn(reg));
+	for_each_memblock(memory, reg) {
+#ifdef CONFIG_NUMA
+		nid = reg->nid;
+#endif
+		memory_present(nid, memblock_region_memory_base_pfn(reg),
+				memblock_region_memory_end_pfn(reg));
+	}
 }
 #endif
 
@@ -200,6 +221,10 @@ void __init bootmem_init(void)
 	min = PFN_UP(memblock_start_of_DRAM());
 	max = PFN_DOWN(memblock_end_of_DRAM());
 
+	high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
+	max_pfn = max_low_pfn = max;
+
+	arm64_numa_init();
 	/*
 	 * Sparsemem tries to allocate bootmem in memory_present(), so must be
 	 * done after the fixed reservations.
@@ -208,9 +233,6 @@ void __init bootmem_init(void)
 
 	sparse_init();
 	zone_sizes_init(min, max);
-
-	high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
-	max_pfn = max_low_pfn = max;
 }
 
 #ifndef CONFIG_SPARSEMEM_VMEMMAP
diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
new file mode 100644
index 0000000..2be83de
--- /dev/null
+++ b/arch/arm64/mm/numa.c
@@ -0,0 +1,550 @@
+/*
+ * NUMA support, based on the x86 implementation.
+ *
+ * Copyright (C) 2015 Cavium Inc.
+ * Author: Ganapatrao Kulkarni <gkulkarni@cavium.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/string.h>
+#include <linux/init.h>
+#include <linux/bootmem.h>
+#include <linux/memblock.h>
+#include <linux/ctype.h>
+#include <linux/module.h>
+#include <linux/nodemask.h>
+#include <linux/sched.h>
+#include <linux/topology.h>
+#include <linux/mmzone.h>
+
+#include <asm/smp_plat.h>
+
+int __initdata numa_off;
+nodemask_t numa_nodes_parsed __initdata;
+static int numa_distance_cnt;
+static u8 *numa_distance;
+static u8 dummy_numa_enabled;
+
+struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
+EXPORT_SYMBOL(node_data);
+
+struct __node_cpu_hwid node_cpu_hwid[NR_CPUS];
+static struct numa_meminfo numa_meminfo;
+
+static __init int numa_setup(char *opt)
+{
+	if (!opt)
+		return -EINVAL;
+	if (!strncmp(opt, "off", 3)) {
+		pr_info("%s\n", "NUMA turned off");
+		numa_off = 1;
+	}
+	return 0;
+}
+early_param("numa", numa_setup);
+
+cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
+EXPORT_SYMBOL(node_to_cpumask_map);
+
+int cpu_to_node_map[NR_CPUS];
+EXPORT_SYMBOL(cpu_to_node_map);
+
+/*
+ * Returns a pointer to the bitmask of CPUs on Node 'node'.
+ */
+const struct cpumask *cpumask_of_node(int node)
+{
+	if (node >= nr_node_ids) {
+		pr_warn("cpumask_of_node(%d): node > nr_node_ids(%d)\n",
+			node, nr_node_ids);
+		dump_stack();
+		return cpu_none_mask;
+	}
+	if (node_to_cpumask_map[node] == NULL) {
+		pr_warn("cpumask_of_node(%d): no node_to_cpumask_map!\n",
+			node);
+		dump_stack();
+		return cpu_online_mask;
+	}
+	return node_to_cpumask_map[node];
+}
+EXPORT_SYMBOL(cpumask_of_node);
+
+void numa_clear_node(int cpu)
+{
+	set_cpu_numa_node(cpu, NUMA_NO_NODE);
+}
+
+void map_cpu_to_node(int cpu, int nid)
+{
+	if (nid < 0) { /* just initialize by zero */
+		cpu_to_node_map[cpu] = 0;
+		return;
+	}
+
+	cpu_to_node_map[cpu] = nid;
+	cpumask_set_cpu(cpu, node_to_cpumask_map[nid]);
+	set_numa_node(nid);
+}
+
+/**
+ * build_cpu_to_node_map - setup cpu to node and node to cpumask arrays
+ *
+ * Build cpu to node mapping and initialize the per node cpu masks using
+ * info from the node_cpu_hwid array handed to us by ACPI or DT.
+ */
+void __init build_cpu_to_node_map(void)
+{
+	int cpu, i, node;
+
+	for (node = 0; node < MAX_NUMNODES; node++)
+		cpumask_clear(node_to_cpumask_map[node]);
+
+	for_each_possible_cpu(cpu) {
+		node = NUMA_NO_NODE;
+		for_each_possible_cpu(i) {
+			if (cpu_logical_map(cpu) == node_cpu_hwid[i].cpu_hwid) {
+				node = node_cpu_hwid[i].node_id;
+				break;
+			}
+		}
+		map_cpu_to_node(cpu, node);
+	}
+}
+/*
+ * Allocate node_to_cpumask_map based on number of available nodes
+ * Requires node_possible_map to be valid.
+ *
+ * Note: cpumask_of_node() is not valid until after this is done.
+ * (Use CONFIG_DEBUG_PER_CPU_MAPS to check this.)
+ */
+void __init setup_node_to_cpumask_map(void)
+{
+	unsigned int node;
+
+	/* setup nr_node_ids if not done yet */
+	if (nr_node_ids == MAX_NUMNODES)
+		setup_nr_node_ids();
+
+	/* allocate the map */
+	for (node = 0; node < nr_node_ids; node++)
+		alloc_bootmem_cpumask_var(&node_to_cpumask_map[node]);
+
+	/* cpumask_of_node() will now work */
+	pr_debug("Node to cpumask map for %d nodes\n", nr_node_ids);
+}
+
+/*
+ *  Set the cpu to node and mem mapping
+ */
+void numa_store_cpu_info(int cpu)
+{
+	if (dummy_numa_enabled) {
+		/* set to default */
+		node_cpu_hwid[cpu].node_id  =  0;
+		node_cpu_hwid[cpu].cpu_hwid = cpu_logical_map(cpu);
+	}
+	map_cpu_to_node(cpu, node_cpu_hwid[cpu].node_id);
+}
+
+/**
+ * numa_add_memblk_to - Add one numa_memblk to a numa_meminfo
+ */
+
+static int __init numa_add_memblk_to(int nid, u64 start, u64 end,
+				     struct numa_meminfo *mi)
+{
+	/* ignore zero length blks */
+	if (start == end)
+		return 0;
+
+	/* whine about and ignore invalid blks */
+	if (start > end || nid < 0 || nid >= MAX_NUMNODES) {
+		pr_warn("NUMA: Warning: invalid memblk node %d [mem %#010Lx-%#010Lx]\n",
+				nid, start, end - 1);
+		return 0;
+	}
+
+	if (mi->nr_blks >= NR_NODE_MEMBLKS) {
+		pr_err("NUMA: too many memblk ranges\n");
+		return -EINVAL;
+	}
+
+	pr_info("NUMA: Adding memblock %d [0x%llx - 0x%llx] on node %d\n",
+			mi->nr_blks, start, end, nid);
+	mi->blk[mi->nr_blks].start = start;
+	mi->blk[mi->nr_blks].end = end;
+	mi->blk[mi->nr_blks].nid = nid;
+	mi->nr_blks++;
+	return 0;
+}
+
+/**
+ * numa_add_memblk - Add one numa_memblk to numa_meminfo
+ * @nid: NUMA node ID of the new memblk
+ * @start: Start address of the new memblk
+ * @end: End address of the new memblk
+ *
+ * Add a new memblk to the default numa_meminfo.
+ *
+ * RETURNS:
+ * 0 on success, -errno on failure.
+ */
+#define MAX_PHYS_ADDR	((phys_addr_t)~0)
+
+int __init numa_add_memblk(u32 nid, u64 base, u64 end)
+{
+	const u64 phys_offset = __pa(PAGE_OFFSET);
+
+	base &= PAGE_MASK;
+	end &= PAGE_MASK;
+
+	if (base > MAX_PHYS_ADDR) {
+		pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
+				base, base + end);
+		return -ENOMEM;
+	}
+
+	if (base + end > MAX_PHYS_ADDR) {
+		pr_info("NUMA: Ignoring memory range 0x%lx - 0x%llx\n",
+				ULONG_MAX, base + end);
+		end = MAX_PHYS_ADDR - base;
+	}
+
+	if (base + end < phys_offset) {
+		pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
+			   base, base + end);
+		return -ENOMEM;
+	}
+	if (base < phys_offset) {
+		pr_info("NUMA: Ignoring memory range 0x%llx - 0x%llx\n",
+			   base, phys_offset);
+		end -= phys_offset - base;
+		base = phys_offset;
+	}
+
+	return numa_add_memblk_to(nid, base, base + end, &numa_meminfo);
+}
+EXPORT_SYMBOL(numa_add_memblk);
+
+/* Initialize NODE_DATA for a node on the local memory */
+static void __init setup_node_data(int nid, u64 start, u64 end)
+{
+	const size_t nd_size = roundup(sizeof(pg_data_t), PAGE_SIZE);
+	u64 nd_pa;
+	void *nd;
+	int tnid;
+
+	start = roundup(start, ZONE_ALIGN);
+
+	pr_info("Initmem setup node %d [mem %#010Lx-%#010Lx]\n",
+	       nid, start, end - 1);
+
+	/*
+	 * Allocate node data.  Try node-local memory and then any node.
+	 */
+	nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
+	if (!nd_pa) {
+		nd_pa = __memblock_alloc_base(nd_size, SMP_CACHE_BYTES,
+					      MEMBLOCK_ALLOC_ACCESSIBLE);
+		if (!nd_pa) {
+			pr_err("Cannot find %zu bytes in node %d\n",
+			       nd_size, nid);
+			return;
+		}
+	}
+	nd = __va(nd_pa);
+
+	/* report and initialize */
+	pr_info("  NODE_DATA [mem %#010Lx-%#010Lx]\n",
+	       nd_pa, nd_pa + nd_size - 1);
+	tnid = early_pfn_to_nid(nd_pa >> PAGE_SHIFT);
+	if (tnid != nid)
+		pr_info("    NODE_DATA(%d) on node %d\n", nid, tnid);
+
+	node_data[nid] = nd;
+	memset(NODE_DATA(nid), 0, sizeof(pg_data_t));
+	NODE_DATA(nid)->node_id = nid;
+	NODE_DATA(nid)->node_start_pfn = start >> PAGE_SHIFT;
+	NODE_DATA(nid)->node_spanned_pages = (end - start) >> PAGE_SHIFT;
+
+	node_set_online(nid);
+}
+
+/*
+ * Set nodes, which have memory in @mi, in *@nodemask.
+ */
+static void __init numa_nodemask_from_meminfo(nodemask_t *nodemask,
+					      const struct numa_meminfo *mi)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(mi->blk); i++)
+		if (mi->blk[i].start != mi->blk[i].end &&
+		    mi->blk[i].nid != NUMA_NO_NODE)
+			node_set(mi->blk[i].nid, *nodemask);
+}
+
+/*
+ * Sanity check to catch more bad NUMA configurations (they are amazingly
+ * common).  Make sure the nodes cover all memory.
+ */
+static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
+{
+	u64 numaram, totalram;
+	int i;
+
+	numaram = 0;
+	for (i = 0; i < mi->nr_blks; i++) {
+		u64 s = mi->blk[i].start >> PAGE_SHIFT;
+		u64 e = mi->blk[i].end >> PAGE_SHIFT;
+
+		numaram += e - s;
+		numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
+		if ((s64)numaram < 0)
+			numaram = 0;
+	}
+
+	totalram = max_pfn - absent_pages_in_range(0, max_pfn);
+
+	/* We seem to lose 3 pages somewhere. Allow 1M of slack. */
+	if ((s64)(totalram - numaram) >= (1 << (20 - PAGE_SHIFT))) {
+		pr_err("NUMA: nodes only cover %lluMB of your %lluMB Total RAM. Not used.\n",
+		       (numaram << PAGE_SHIFT) >> 20,
+		       (totalram << PAGE_SHIFT) >> 20);
+		return false;
+	}
+	return true;
+}
+
+/**
+ * numa_reset_distance - Reset NUMA distance table
+ *
+ * The current table is freed.  The next numa_set_distance() call will
+ * create a new one.
+ */
+void __init numa_reset_distance(void)
+{
+	size_t size = numa_distance_cnt * numa_distance_cnt *
+		sizeof(numa_distance[0]);
+
+	/* numa_distance could be 1LU marking allocation failure, test cnt */
+	if (numa_distance_cnt)
+		memblock_free(__pa(numa_distance), size);
+	numa_distance_cnt = 0;
+	numa_distance = NULL;	/* enable table creation */
+}
+
+static int __init numa_alloc_distance(void)
+{
+	nodemask_t nodes_parsed;
+	size_t size;
+	int i, j, cnt = 0;
+	u64 phys;
+
+	/* size the new table and allocate it */
+	nodes_parsed = numa_nodes_parsed;
+	numa_nodemask_from_meminfo(&nodes_parsed, &numa_meminfo);
+
+	for_each_node_mask(i, nodes_parsed)
+		cnt = i;
+	cnt++;
+	size = cnt * cnt * sizeof(numa_distance[0]);
+
+	phys = memblock_find_in_range(0, PFN_PHYS(max_pfn),
+				      size, PAGE_SIZE);
+	if (!phys) {
+		pr_warn("NUMA: Warning: can't allocate distance table!\n");
+		/* don't retry until explicitly reset */
+		numa_distance = (void *)1LU;
+		return -ENOMEM;
+	}
+	memblock_reserve(phys, size);
+
+	numa_distance = __va(phys);
+	numa_distance_cnt = cnt;
+
+	/* fill with the default distances */
+	for (i = 0; i < cnt; i++)
+		for (j = 0; j < cnt; j++)
+			numa_distance[i * cnt + j] = i == j ?
+				LOCAL_DISTANCE : REMOTE_DISTANCE;
+	pr_debug("NUMA: Initialized distance table, cnt=%d\n", cnt);
+
+	return 0;
+}
+
+/**
+ * numa_set_distance - Set NUMA distance from one NUMA node to another
+ * @from: the 'from' node to set distance
+ * @to: the 'to'  node to set distance
+ * @distance: NUMA distance
+ *
+ * Set the distance from node @from to @to to @distance.  If distance table
+ * doesn't exist, one which is large enough to accommodate all the currently
+ * known nodes will be created.
+ *
+ * If such table cannot be allocated, a warning is printed and further
+ * calls are ignored until the distance table is reset with
+ * numa_reset_distance().
+ *
+ * If @from or @to is higher than the highest known node or lower than zero
+ * at the time of table creation or @distance doesn't make sense, the call
+ * is ignored.
+ * This is to allow simplification of specific NUMA config implementations.
+ */
+void __init numa_set_distance(int from, int to, int distance)
+{
+	if (!numa_distance && numa_alloc_distance() < 0)
+		return;
+
+	if (from >= numa_distance_cnt || to >= numa_distance_cnt ||
+			from < 0 || to < 0) {
+		pr_warn_once("NUMA: Warning: node ids are out of bound, from=%d to=%d distance=%d\n",
+			    from, to, distance);
+		return;
+	}
+
+	if ((u8)distance != distance ||
+	    (from == to && distance != LOCAL_DISTANCE)) {
+		pr_warn_once("NUMA: Warning: invalid distance parameter, from=%d to=%d distance=%d\n",
+			     from, to, distance);
+		return;
+	}
+
+	numa_distance[from * numa_distance_cnt + to] = distance;
+}
+EXPORT_SYMBOL(numa_set_distance);
+
+int __node_distance(int from, int to)
+{
+	if (from >= numa_distance_cnt || to >= numa_distance_cnt)
+		return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
+	return numa_distance[from * numa_distance_cnt + to];
+}
+EXPORT_SYMBOL(__node_distance);
+
+static int __init numa_register_memblks(struct numa_meminfo *mi)
+{
+	unsigned long uninitialized_var(pfn_align);
+	int i, nid;
+
+	/* Account for nodes with cpus and no memory */
+	node_possible_map = numa_nodes_parsed;
+	numa_nodemask_from_meminfo(&node_possible_map, mi);
+	if (WARN_ON(nodes_empty(node_possible_map)))
+		return -EINVAL;
+
+	for (i = 0; i < mi->nr_blks; i++) {
+		struct numa_memblk *mb = &mi->blk[i];
+
+		memblock_set_node(mb->start, mb->end - mb->start,
+				  &memblock.memory, mb->nid);
+	}
+
+	/*
+	 * If sections array is gonna be used for pfn -> nid mapping, check
+	 * whether its granularity is fine enough.
+	 */
+#ifdef NODE_NOT_IN_PAGE_FLAGS
+	pfn_align = node_map_pfn_alignment();
+	if (pfn_align && pfn_align < PAGES_PER_SECTION) {
+		pr_warn("Node alignment %lluMB < min %lluMB, rejecting NUMA config\n",
+		       PFN_PHYS(pfn_align) >> 20,
+		       PFN_PHYS(PAGES_PER_SECTION) >> 20);
+		return -EINVAL;
+	}
+#endif
+	if (!numa_meminfo_cover_memory(mi))
+		return -EINVAL;
+
+	/* Finally register nodes. */
+	for_each_node_mask(nid, node_possible_map) {
+		u64 start = PFN_PHYS(max_pfn);
+		u64 end = 0;
+
+		for (i = 0; i < mi->nr_blks; i++) {
+			if (nid != mi->blk[i].nid)
+				continue;
+			start = min(mi->blk[i].start, start);
+			end = max(mi->blk[i].end, end);
+		}
+
+		if (start < end)
+			setup_node_data(nid, start, end);
+	}
+
+	/* Dump memblock with node info and return. */
+	memblock_dump_all();
+	return 0;
+}
+
+static int __init numa_init(int (*init_func)(void))
+{
+	int ret, i;
+
+	nodes_clear(numa_nodes_parsed);
+	nodes_clear(node_possible_map);
+	nodes_clear(node_online_map);
+
+	ret = init_func();
+	if (ret < 0)
+		return ret;
+
+	ret = numa_register_memblks(&numa_meminfo);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < nr_cpu_ids; i++)
+		numa_clear_node(i);
+
+	setup_node_to_cpumask_map();
+	build_cpu_to_node_map();
+	return 0;
+}
+
+/**
+ * dummy_numa_init - Fallback dummy NUMA init
+ *
+ * Used if there's no underlying NUMA architecture, NUMA initialization
+ * fails, or NUMA is disabled on the command line.
+ *
+ * Must online at least one node and add memory blocks that cover all
+ * allowed memory.  This function must not fail.
+ */
+static int __init dummy_numa_init(void)
+{
+	pr_info("%s\n", "No NUMA configuration found");
+	pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n",
+	       0LLU, PFN_PHYS(max_pfn) - 1);
+	node_set(0, numa_nodes_parsed);
+	numa_add_memblk(0, 0, PFN_PHYS(max_pfn));
+	dummy_numa_enabled = 1;
+
+	return 0;
+}
+
+/**
+ * arm64_numa_init - Initialize NUMA
+ *
+ * Try each configured NUMA initialization method until one succeeds.  The
+ * last fallback is a dummy single-node config encompassing all memory and
+ * never fails.
+ */
+void __init arm64_numa_init(void)
+{
+	numa_init(dummy_numa_init);
+}
-- 
1.8.1.4
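
Note on the coverage check: numa_meminfo_cover_memory() in the patch above
rejects a NUMA configuration when the blocks assigned to nodes fall short of
total RAM by 1MB or more, expressed in pages as 1 << (20 - PAGE_SHIFT). The
sketch below reproduces only that arithmetic in userspace. The function name
covers_memory() and the explicit page_shift parameter are assumptions made
for illustration; the kernel takes PAGE_SHIFT from the build configuration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 when the pages described by the NUMA blocks cover total RAM
 * to within 1 MiB of slack, 0 otherwise. Both counts are in pages.
 */
static int covers_memory(uint64_t numa_pages, uint64_t total_pages,
			 unsigned int page_shift)
{
	int64_t missing, slack;

	assert(page_shift <= 20);

	/* Signed difference: over-coverage counts as covered. */
	missing = (int64_t)(total_pages - numa_pages);
	/* 1 MiB expressed in pages, e.g. 256 pages of 4 KiB. */
	slack = (int64_t)1 << (20 - page_shift);

	return missing < slack;
}
```

With 4 KiB pages (page_shift 12) the slack is 256 pages, so a shortfall of
255 pages is accepted and a shortfall of 256 pages is rejected. With 64 KiB
pages (page_shift 16) the slack shrinks to 16 pages.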

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-14 16:39 ` Ganapatrao Kulkarni
@ 2015-08-14 16:39   ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-08-14 16:39 UTC (permalink / raw)
  To: linux-arm-kernel, devicetree, Will.Deacon, catalin.marinas,
	grant.likely, leif.lindholm, rfranz, ard.biesheuvel, msalter,
	robh+dt, steve.capper, hanjun.guo, al.stone, arnd, pawel.moll,
	mark.rutland, ijc+devicetree, galak
  Cc: gpkulkarni

Add DT bindings that map memory, cores and I/O devices to NUMA nodes
using the arm,associativity device node property.

Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
---
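
Note on reading the binding: the parsing code is not part of this patch, so
the sketch below is only one plausible reading of the text in numa.txt. It
assumes that the node id of a device is the arm,associativity cell selected
by the first entry of arm,associativity-reference-points, and that two
devices are local to each other when they agree at every reference point.
The helper names assoc_to_nid() and assoc_same_node() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the cell selected by the first (most
 * significant) reference point, or -1 if there is no usable one.
 */
static int assoc_to_nid(const uint32_t *assoc, size_t ncells,
			const uint32_t *ref_points, size_t nrefs)
{
	assert(assoc && ref_points);

	if (nrefs == 0 || ref_points[0] >= ncells)
		return -1;
	return (int)assoc[ref_points[0]];
}

/*
 * Hypothetical helper: two devices share a node when their cells match
 * at every reference point. Returns 1 on a match, 0 otherwise.
 */
static int assoc_same_node(const uint32_t *a, const uint32_t *b,
			   size_t ncells,
			   const uint32_t *ref_points, size_t nrefs)
{
	size_t i;

	assert(a && b && ref_points);

	for (i = 0; i < nrefs; i++) {
		uint32_t idx = ref_points[i];

		if (idx >= ncells || a[idx] != b[idx])
			return 0;
	}
	return 1;
}
```

Under these assumptions, with reference points <0>, a cpu carrying
arm,associativity = <1 0 0> and a memory node carrying <1 0 0xffff> both map
to node 1, while a cpu carrying <0 0 0> maps to node 0.
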
 Documentation/devicetree/bindings/arm/numa.txt | 212 +++++++++++++++++++++++++
 1 file changed, 212 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/numa.txt

diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
new file mode 100644
index 0000000..dc3ef86
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/numa.txt
@@ -0,0 +1,212 @@
+==============================================================================
+NUMA binding description.
+==============================================================================
+
+==============================================================================
+1 - Introduction
+==============================================================================
+
+Systems employing a Non Uniform Memory Access (NUMA) architecture contain
+collections of hardware resources including processors, memory, and I/O buses,
+that comprise what is commonly known as a NUMA node.
+Processor accesses to memory within the local NUMA node are generally faster
+than processor accesses to memory outside of the local NUMA node.
+DT defines interfaces that allow the platform to convey NUMA node
+topology information to the OS.
+
+==============================================================================
+2 - arm,associativity
+==============================================================================
+The mapping is done using the arm,associativity device property.
+This property needs to be present in every device node which needs to be
+mapped to NUMA nodes.
+
+The arm,associativity property is a set of 32-bit integers which defines the
+levels of topology and the boundaries in the system at which a significant
+difference in performance can be measured between cross-device accesses within
+a single location and those spanning multiple locations.
+The first cell always contains the broadest subdivision within the system,
+while the last cell enumerates the individual devices, such as an SMT thread
+of a CPU, or a bus bridge within an SoC.
+
+Ex:
+	/* board 0, socket 0, cluster 0, core 0, thread 0 */
+	arm,associativity = <0 0 0 0 0>;
+
+==============================================================================
+3 - arm,associativity-reference-points
+==============================================================================
+This property is a set of 32-bit integers, each representing an index into
+the arm,associativity cells. The first integer is the most significant
+NUMA boundary and the following are progressively less significant boundaries.
+There can be more than one level of NUMA.
+
+Ex:
+	arm,associativity-reference-points = <0 1>;
+	The board ID (index 0) is used first to calculate the associativity (node
+	distance), followed by the socket ID (index 1).
+
+	arm,associativity-reference-points = <1 0>;
+	The socket ID (index 1) is used first to calculate the associativity,
+	followed by the board ID (index 0).
+
+	arm,associativity-reference-points = <0>;
+	Only the board ID (index 0) is used to calculate the associativity.
+
+	arm,associativity-reference-points = <1>;
+	Only the socket ID (index 1) is used to calculate the associativity.
+
+==============================================================================
+4 - Example dts
+==============================================================================
+
+Example: a 2-node system consisting of 2 boards, each with one socket
+and 8 cores in each socket.
+
+	arm,associativity-reference-points = <0>;
+
+	memory@00c00000 {
+		device_type = "memory";
+		reg = <0x0 0x00c00000 0x0 0x80000000>;
+		/* board 0, socket 0, no specific core */
+		arm,associativity = <0 0 0xffff>;
+	};
+
+	memory@10000000000 {
+		device_type = "memory";
+		reg = <0x100 0x00000000 0x0 0x80000000>;
+		/* board 1, socket 0, no specific core */
+		arm,associativity = <1 0 0xffff>;
+	};
+
+	cpus {
+		#address-cells = <2>;
+		#size-cells = <0>;
+
+		cpu@000 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x000>;
+			enable-method = "psci";
+			/* board 0, socket 0, core 0*/
+			arm,associativity = <0 0 0>;
+		};
+		cpu@001 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x001>;
+			enable-method = "psci";
+			arm,associativity = <0 0 1>;
+		};
+		cpu@002 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x002>;
+			enable-method = "psci";
+			arm,associativity = <0 0 2>;
+		};
+		cpu@003 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x003>;
+			enable-method = "psci";
+			arm,associativity = <0 0 3>;
+		};
+		cpu@004 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x004>;
+			enable-method = "psci";
+			arm,associativity = <0 0 4>;
+		};
+		cpu@005 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x005>;
+			enable-method = "psci";
+			arm,associativity = <0 0 5>;
+		};
+		cpu@006 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x006>;
+			enable-method = "psci";
+			arm,associativity = <0 0 6>;
+		};
+		cpu@007 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x007>;
+			enable-method = "psci";
+			arm,associativity = <0 0 7>;
+		};
+		cpu@008 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x008>;
+			enable-method = "psci";
+			/* board 1, socket 0, core 0*/
+			arm,associativity = <1 0 0>;
+		};
+		cpu@009 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x009>;
+			enable-method = "psci";
+			arm,associativity = <1 0 1>;
+		};
+		cpu@00a {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00a>;
+			enable-method = "psci";
+			arm,associativity = <1 0 2>;
+		};
+		cpu@00b {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00b>;
+			enable-method = "psci";
+			arm,associativity = <1 0 3>;
+		};
+		cpu@00c {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00c>;
+			enable-method = "psci";
+			arm,associativity = <1 0 4>;
+		};
+		cpu@00d {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00d>;
+			enable-method = "psci";
+			arm,associativity = <1 0 5>;
+		};
+		cpu@00e {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00e>;
+			enable-method = "psci";
+			arm,associativity = <1 0 6>;
+		};
+		cpu@00f {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00f>;
+			enable-method = "psci";
+			arm,associativity = <1 0 7>;
+		};
+	};
+
+	pcie0: pcie0@0x8480,00000000 {
+		compatible = "arm,armv8";
+		device_type = "pci";
+		bus-range = <0 255>;
+		#size-cells = <2>;
+		#address-cells = <3>;
+		reg = <0x8480 0x00000000 0 0x10000000>;  /* Configuration space */
+		ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>; /* mem ranges */
+		/* board 0, socket 0, pci bus 0*/
+		arm,associativity = <0 0 0>;
+	};
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-08-14 16:39   ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-08-14 16:39 UTC (permalink / raw)
  To: linux-arm-kernel

Add DT bindings that map memory, cores and I/O devices to NUMA nodes
using the arm,associativity device node property.

Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
---
 Documentation/devicetree/bindings/arm/numa.txt | 212 +++++++++++++++++++++++++
 1 file changed, 212 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/numa.txt

diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
new file mode 100644
index 0000000..dc3ef86
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/numa.txt
@@ -0,0 +1,212 @@
+==============================================================================
+NUMA binding description.
+==============================================================================
+
+==============================================================================
+1 - Introduction
+==============================================================================
+
+Systems employing a Non Uniform Memory Access (NUMA) architecture contain
+collections of hardware resources including processors, memory, and I/O buses,
+that comprise what is commonly known as a NUMA node.
+Processor accesses to memory within the local NUMA node are generally faster
+than processor accesses to memory outside of the local NUMA node.
+DT defines interfaces that allow the platform to convey NUMA node
+topology information to the OS.
+
+==============================================================================
+2 - arm,associativity
+==============================================================================
+The mapping is done using the arm,associativity device property.
+This property needs to be present in every device node which needs to be
+mapped to NUMA nodes.
+
+The arm,associativity property is a set of 32-bit integers which defines the
+levels of topology and the boundaries in the system at which a significant
+difference in performance can be measured between cross-device accesses within
+a single location and those spanning multiple locations.
+The first cell always contains the broadest subdivision within the system,
+while the last cell enumerates the individual devices, such as an SMT thread
+of a CPU, or a bus bridge within an SoC.
+
+Ex:
+	/* board 0, socket 0, cluster 0, core 0, thread 0 */
+	arm,associativity = <0 0 0 0 0>;
+
+==============================================================================
+3 - arm,associativity-reference-points
+==============================================================================
+This property is a set of 32-bit integers, each representing an index into
+the cells of the arm,associativity property. The first integer is the most
+significant NUMA boundary and the following are progressively less
+significant boundaries. There can be more than one level of NUMA.
+
+Ex:
+	arm,associativity-reference-points = <0 1>;
+	The board id (index 0) is used first to calculate the associativity
+	(node distance), followed by the socket id (index 1).
+
+	arm,associativity-reference-points = <1 0>;
+	The socket id (index 1) is used first to calculate the associativity,
+	followed by the board id (index 0).
+
+	arm,associativity-reference-points = <0>;
+	Only the board id (index 0) is used to calculate the associativity.
+
+	arm,associativity-reference-points = <1>;
+	Only the socket id (index 1) is used to calculate the associativity.
+
+==============================================================================
+4 - Example dts
+==============================================================================
+
+Example: a 2-node system consisting of 2 boards, each board having one socket
+with 8 cores.
+
+	arm,associativity-reference-points = <0>;
+
+	memory@00c00000 {
+		device_type = "memory";
+		reg = <0x0 0x00c00000 0x0 0x80000000>;
+		/* board 0, socket 0, no specific core */
+		arm,associativity = <0 0 0xffff>;
+	};
+
+	memory@10000000000 {
+		device_type = "memory";
+		reg = <0x100 0x00000000 0x0 0x80000000>;
+		/* board 1, socket 0, no specific core */
+		arm,associativity = <1 0 0xffff>;
+	};
+
+	cpus {
+		#address-cells = <2>;
+		#size-cells = <0>;
+
+		cpu@000 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x000>;
+			enable-method = "psci";
+			/* board 0, socket 0, core 0*/
+			arm,associativity = <0 0 0>;
+		};
+		cpu@001 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x001>;
+			enable-method = "psci";
+			arm,associativity = <0 0 1>;
+		};
+		cpu@002 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x002>;
+			enable-method = "psci";
+			arm,associativity = <0 0 2>;
+		};
+		cpu@003 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x003>;
+			enable-method = "psci";
+			arm,associativity = <0 0 3>;
+		};
+		cpu@004 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x004>;
+			enable-method = "psci";
+			arm,associativity = <0 0 4>;
+		};
+		cpu@005 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x005>;
+			enable-method = "psci";
+			arm,associativity = <0 0 5>;
+		};
+		cpu@006 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x006>;
+			enable-method = "psci";
+			arm,associativity = <0 0 6>;
+		};
+		cpu@007 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x007>;
+			enable-method = "psci";
+			arm,associativity = <0 0 7>;
+		};
+		cpu@008 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x008>;
+			enable-method = "psci";
+			/* board 1, socket 0, core 0*/
+			arm,associativity = <1 0 0>;
+		};
+		cpu@009 {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x009>;
+			enable-method = "psci";
+			arm,associativity = <1 0 1>;
+		};
+		cpu@00a {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00a>;
+			enable-method = "psci";
+			arm,associativity = <1 0 2>;
+		};
+		cpu@00b {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00b>;
+			enable-method = "psci";
+			arm,associativity = <1 0 3>;
+		};
+		cpu@00c {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00c>;
+			enable-method = "psci";
+			arm,associativity = <1 0 4>;
+		};
+		cpu@00d {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00d>;
+			enable-method = "psci";
+			arm,associativity = <1 0 5>;
+		};
+		cpu@00e {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00e>;
+			enable-method = "psci";
+			arm,associativity = <1 0 6>;
+		};
+		cpu@00f {
+			device_type = "cpu";
+			compatible =  "arm,armv8";
+			reg = <0x0 0x00f>;
+			enable-method = "psci";
+			arm,associativity = <1 0 7>;
+		};
+	};
+
+	pcie0: pcie0@0x8480,00000000 {
+		compatible = "arm,armv8";
+		device_type = "pci";
+		bus-range = <0 255>;
+		#size-cells = <2>;
+		#address-cells = <3>;
+		reg = <0x8480 0x00000000 0 0x10000000>;  /* Configuration space */
+		ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>; /* mem ranges */
+		/* board 0, socket 0, pci bus 0*/
+		arm,associativity = <0 0 0>;
+	};
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH v5 3/4] arm64, numa, dt: adding dt based numa support using dt node property arm,associativity
  2015-08-14 16:39 ` Ganapatrao Kulkarni
@ 2015-08-14 16:39   ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-08-14 16:39 UTC (permalink / raw)
  To: linux-arm-kernel, devicetree, Will.Deacon, catalin.marinas,
	grant.likely, leif.lindholm, rfranz, ard.biesheuvel, msalter,
	robh+dt, steve.capper, hanjun.guo, al.stone, arnd, pawel.moll,
	mark.rutland, ijc+devicetree, galak
  Cc: gpkulkarni

Add DT node parsing for NUMA topology using the arm,associativity property.
The arm,associativity property can be used to map memory, CPU and
I/O devices to NUMA nodes.

Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
---
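
For reference, the distance rule in dt_get_node_distance() can be tried out
in user space. The sketch below mirrors that loop under stated assumptions:
node_distance() and its row arguments are illustrative names, each row holds
the associativity cells that the reference points pick out for one node (most
significant first), and LOCAL_DISTANCE is 10 as in include/linux/topology.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOCAL_DISTANCE 10	/* same value as include/linux/topology.h */

/*
 * Walk the reference levels from most to least significant. Every level on
 * which the two nodes differ doubles the distance; the walk stops at the
 * first level on which they agree.
 */
static int node_distance(const uint32_t *row_a, const uint32_t *row_b,
			 size_t depth)
{
	int distance = LOCAL_DISTANCE;
	size_t i;

	assert(row_a != NULL && row_b != NULL);

	for (i = 0; i < depth; i++) {
		if (row_a[i] == row_b[i])
			break;
		distance *= 2;	/* one more NUMA level apart */
	}
	return distance;
}
```

With a single reference point, two different boards end up at distance 20
and a node compared with itself stays at 10. Because the walk stops at the
first matching level, two nodes that share the most significant level are
reported as local even if a less significant level differs.
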
 arch/arm64/Kconfig            |   6 +
 arch/arm64/include/asm/numa.h |   7 +
 arch/arm64/kernel/Makefile    |   1 +
 arch/arm64/kernel/dt_numa.c   | 316 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/smp.c       |   1 +
 arch/arm64/mm/numa.c          |  13 ++
 6 files changed, 344 insertions(+)
 create mode 100644 arch/arm64/kernel/dt_numa.c
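
parse_memory_node() reads (base, size) pairs from "reg" with
dt_mem_next_cell(), which joins consecutive 32-bit cells, most significant
first. A user-space sketch of that arithmetic follows, assuming the cells
have already been converted to host byte order; read_cells() is a
hypothetical stand-in for illustration, not the kernel helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Combine one or two 32-bit cells, most significant first, into a single
 * 64-bit value.
 */
static uint64_t read_cells(const uint32_t *cell, size_t cells)
{
	uint64_t value = 0;

	assert(cell != NULL && cells >= 1 && cells <= 2);

	while (cells--)
		value = (value << 32) | *cell++;

	return value;
}
```

With two address cells and two size cells, reg = <0x100 0x00000000 0x0
0x80000000> decodes to base 0x10000000000 and size 0x80000000, which matches
the unit address of the second memory node in the binding example.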

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fa37a5d..ca0f701 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -571,6 +571,12 @@ config NUMA
 	  local memory controller of the CPU and add some more
 	  NUMA awareness to the kernel.
 
+config ARM64_DT_NUMA
+	bool "Device Tree NUMA support"
+	depends on NUMA
+	help
+	  Enable Device Tree NUMA support.
+
 config NODES_SHIFT
 	int "Maximum NUMA Nodes (as a power of 2)"
 	range 1 10
diff --git a/arch/arm64/include/asm/numa.h b/arch/arm64/include/asm/numa.h
index 59b834e..40c0997 100644
--- a/arch/arm64/include/asm/numa.h
+++ b/arch/arm64/include/asm/numa.h
@@ -33,10 +33,17 @@ int __init numa_add_memblk(u32 nodeid, u64 start, u64 end);
 void numa_store_cpu_info(int cpu);
 void __init build_cpu_to_node_map(void);
 void __init numa_set_distance(int from, int to, int distance);
+#if defined(CONFIG_ARM64_DT_NUMA)
+void __init dt_numa_set_node_info(u32 cpu, u64 hwid, void *dn);
+#else
+static inline void dt_numa_set_node_info(u32 cpu, u64 hwid, void *dn) { }
+#endif
+int __init arm64_dt_numa_init(void);
 #else	/* CONFIG_NUMA */
 static inline void numa_store_cpu_info(int cpu)		{ }
 static inline void arm64_numa_init(void)		{ }
 static inline void build_cpu_to_node_map(void) { }
 static inline void numa_set_distance(int from, int to, int distance) { }
+static inline void dt_numa_set_node_info(u32 cpu, u64 hwid, void *dn) { }
 #endif	/* CONFIG_NUMA */
 #endif	/* _ASM_NUMA_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 9add37b..bc7954b 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -37,6 +37,7 @@ arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= pci-acpi.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
+arm64-obj-$(CONFIG_ARM64_DT_NUMA)	+= dt_numa.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/dt_numa.c b/arch/arm64/kernel/dt_numa.c
new file mode 100644
index 0000000..02b0a57
--- /dev/null
+++ b/arch/arm64/kernel/dt_numa.c
@@ -0,0 +1,316 @@
+/*
+ * DT NUMA Parsing support, based on the powerpc implementation.
+ *
+ * Copyright (C) 2015 Cavium Inc.
+ * Author: Ganapatrao Kulkarni <gkulkarni@cavium.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/string.h>
+#include <linux/init.h>
+#include <linux/bootmem.h>
+#include <linux/memblock.h>
+#include <linux/ctype.h>
+#include <linux/module.h>
+#include <linux/nodemask.h>
+#include <linux/sched.h>
+#include <linux/of.h>
+#include <linux/of_fdt.h>
+#include <asm/smp_plat.h>
+
+#define MAX_DISTANCE_REF_POINTS 8
+static int min_common_depth;
+static int distance_ref_points_depth;
+static const __be32 *distance_ref_points;
+static int distance_lookup_table[MAX_NUMNODES][MAX_DISTANCE_REF_POINTS];
+static int default_nid;
+static int of_node_to_nid_single(struct device_node *device);
+static struct device_node *of_cpu_to_node(int cpu);
+
+static void initialize_distance_lookup_table(int nid,
+		const __be32 *associativity)
+{
+	int i;
+
+	for (i = 0; i < distance_ref_points_depth; i++) {
+		const __be32 *entry;
+
+		entry = &associativity[be32_to_cpu(distance_ref_points[i])];
+		distance_lookup_table[nid][i] = of_read_number(entry, 1);
+	}
+}
+
+/* must hold reference to node during call */
+static const __be32 *of_get_associativity(struct device_node *dev)
+{
+	return of_get_property(dev, "arm,associativity", NULL);
+}
+
+/* Returns nid in the range [0..MAX_NUMNODES-1], or -1 if no useful numa
+ * info is found.
+ */
+static int associativity_to_nid(const __be32 *associativity)
+{
+	int nid = NUMA_NO_NODE;
+
+	if (min_common_depth == -1)
+		goto out;
+
+	if (of_read_number(associativity, 1) >= min_common_depth)
+		nid = of_read_number(&associativity[min_common_depth], 1);
+
+	/* set 0xffff as invalid node */
+	if (nid == 0xffff || nid >= MAX_NUMNODES)
+		nid = NUMA_NO_NODE;
+
+	if (nid != NUMA_NO_NODE)
+		initialize_distance_lookup_table(nid, associativity);
+out:
+	return nid;
+}
+
+/* Returns the nid associated with the given device tree node,
+ * or -1 if not found.
+ */
+static int of_node_to_nid_single(struct device_node *device)
+{
+	int nid = default_nid;
+	const __be32 *tmp;
+
+	tmp = of_get_associativity(device);
+	if (tmp)
+		nid = associativity_to_nid(tmp);
+	return nid;
+}
+
+/* Walk the device tree upwards, looking for an associativity id */
+int of_node_to_nid(struct device_node *device)
+{
+	struct device_node *tmp;
+	int nid = NUMA_NO_NODE;
+
+	of_node_get(device);
+	while (device) {
+		nid = of_node_to_nid_single(device);
+		if (nid != NUMA_NO_NODE)
+			break;
+
+		tmp = device;
+		device = of_get_parent(tmp);
+		of_node_put(tmp);
+	}
+	of_node_put(device);
+
+	return nid;
+}
+
+static int __init find_min_common_depth(unsigned long node)
+{
+	int depth;
+	const __be32 *numa_prop;
+	int nr_address_cells;
+
+	/*
+	 * This property is a set of 32-bit integers, each representing
+	 * an index into the cells of the arm,associativity property.
+	 *
+	 * The first integer is the most significant NUMA boundary and
+	 * the following are progressively less significant boundaries.
+	 * There can be more than one level of NUMA.
+	 */
+
+	distance_ref_points = of_get_flat_dt_prop(node,
+			"arm,associativity-reference-points",
+			&distance_ref_points_depth);
+	numa_prop = distance_ref_points;
+
+	if (numa_prop) {
+		nr_address_cells = dt_mem_next_cell(
+				OF_ROOT_NODE_ADDR_CELLS_DEFAULT, &numa_prop);
+		nr_address_cells = dt_mem_next_cell(
+				OF_ROOT_NODE_ADDR_CELLS_DEFAULT, &numa_prop);
+	}
+	if (!distance_ref_points) {
+		pr_debug("NUMA: arm,associativity-reference-points not found.\n");
+		goto err;
+	}
+
+	distance_ref_points_depth /= sizeof(__be32);
+
+	if (!distance_ref_points_depth) {
+		pr_err("NUMA: missing arm,associativity-reference-points\n");
+		goto err;
+	}
+	depth = of_read_number(distance_ref_points, 1);
+
+	/*
+	 * Warn and cap if the hardware supports more than
+	 * MAX_DISTANCE_REF_POINTS domains.
+	 */
+	if (distance_ref_points_depth > MAX_DISTANCE_REF_POINTS) {
+		pr_debug("NUMA: distance array capped at %d entries\n",
+				MAX_DISTANCE_REF_POINTS);
+		distance_ref_points_depth = MAX_DISTANCE_REF_POINTS;
+	}
+
+	return depth;
+err:
+	return -1;
+}
+
+void __init dt_numa_set_node_info(u32 cpu, u64 hwid,  void *dn_ptr)
+{
+	struct device_node *dn = (struct device_node *) dn_ptr;
+	int nid = default_nid;
+
+	if (dn)
+		nid = of_node_to_nid_single(dn);
+
+	node_cpu_hwid[cpu].node_id = nid;
+	node_cpu_hwid[cpu].cpu_hwid = hwid;
+	node_set(nid, numa_nodes_parsed);
+}
+
+int dt_get_cpu_node_id(int cpu)
+{
+	struct device_node *dn = NULL;
+	int nid = default_nid;
+
+	dn =  of_cpu_to_node(cpu);
+	if (dn)
+		nid = of_node_to_nid_single(dn);
+	return nid;
+}
+
+static struct device_node *of_cpu_to_node(int cpu)
+{
+	struct device_node *dn = NULL;
+
+	while ((dn = of_find_node_by_type(dn, "cpu"))) {
+		const u32 *cell;
+		u64 hwid;
+
+		/*
+		 * A cpu node with missing "reg" property is
+		 * considered invalid to build a cpu_logical_map
+		 * entry.
+		 */
+		cell = of_get_property(dn, "reg", NULL);
+		if (!cell) {
+			pr_err("%s: missing reg property\n", dn->full_name);
+			return NULL;
+		}
+		hwid = of_read_number(cell, of_n_addr_cells(dn));
+
+		if (cpu_logical_map(cpu) == hwid)
+			return dn;
+	}
+	return NULL;
+}
+
+static int __init parse_memory_node(unsigned long node)
+{
+	const __be32 *reg, *endp, *associativity;
+	int length;
+	int nid = default_nid;
+
+	associativity = of_get_flat_dt_prop(node, "arm,associativity", &length);
+
+	if (associativity)
+		nid = associativity_to_nid(associativity);
+
+	reg = of_get_flat_dt_prop(node, "reg", &length);
+	endp = reg + (length / sizeof(__be32));
+
+	while ((endp - reg) >= (dt_root_addr_cells + dt_root_size_cells)) {
+		u64 base, size;
+		struct memblock_region *mblk;
+
+		base = dt_mem_next_cell(dt_root_addr_cells, &reg);
+		size = dt_mem_next_cell(dt_root_size_cells, &reg);
+		pr_debug("NUMA-DT: base = %llx, node = %d\n",
+				base, nid);
+		for_each_memblock(memory, mblk) {
+			if (mblk->base == base) {
+				node_set(nid, numa_nodes_parsed);
+				numa_add_memblk(nid, mblk->base, mblk->size);
+				break;
+			}
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * early_init_dt_scan_numa_map - parse memory node and map nid to memory range.
+ */
+int __init early_init_dt_scan_numa_map(unsigned long node, const char *uname,
+				     int depth, void *data)
+{
+	const char *type = of_get_flat_dt_prop(node, "device_type", NULL);
+
+	if (depth == 0) {
+		min_common_depth = find_min_common_depth(node);
+		if (min_common_depth < 0)
+			return min_common_depth;
+		pr_debug("NUMA associativity depth for CPU/Memory: %d\n",
+				min_common_depth);
+		return 0;
+	}
+
+	if (type) {
+		if (strcmp(type, "memory") == 0)
+			parse_memory_node(node);
+	}
+	return 0;
+}
+
+int dt_get_node_distance(int a, int b)
+{
+	int i;
+	int distance = LOCAL_DISTANCE;
+
+	for (i = 0; i < distance_ref_points_depth; i++) {
+		if (distance_lookup_table[a][i] == distance_lookup_table[b][i])
+			break;
+
+		/* Double the distance for each NUMA level */
+		distance *= 2;
+	}
+	return distance;
+}
+
+/* DT node mapping is already done in early_init_dt_scan_memory */
+int __init arm64_dt_numa_init(void)
+{
+	int i;
+	u32 nodea, nodeb, distance, node_count = 0;
+
+	of_scan_flat_dt(early_init_dt_scan_numa_map, NULL);
+
+	for_each_node_mask(i, numa_nodes_parsed)
+		node_count = i;
+	node_count++;
+
+	for (nodea =  0; nodea < node_count; nodea++) {
+		for (nodeb = 0; nodeb < node_count; nodeb++) {
+			distance = dt_get_node_distance(nodea, nodeb);
+			numa_set_distance(nodea, nodeb, distance);
+		}
+	}
+	return 0;
+}
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index ae3e02c..bdf0358 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -503,6 +503,7 @@ void __init of_parse_and_init_cpus(void)
 
 		pr_debug("cpu logical map 0x%llx\n", hwid);
 		cpu_logical_map(cpu_count) = hwid;
+		dt_numa_set_node_info(cpu_count, hwid, (void *)dn);
 next:
 		cpu_count++;
 	}
diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
index 2be83de..0c879eb 100644
--- a/arch/arm64/mm/numa.c
+++ b/arch/arm64/mm/numa.c
@@ -28,6 +28,7 @@
 #include <linux/nodemask.h>
 #include <linux/sched.h>
 #include <linux/topology.h>
+#include <linux/of.h>
 #include <linux/mmzone.h>
 
 #include <asm/smp_plat.h>
@@ -546,5 +547,17 @@ static int __init dummy_numa_init(void)
  */
 void __init arm64_numa_init(void)
 {
+	int (*init_func)(void);
+
+	if (IS_ENABLED(CONFIG_ARM64_DT_NUMA))
+		init_func = arm64_dt_numa_init;
+	else
+		init_func = NULL;
+
+	if (!numa_off && init_func) {
+		if (!numa_init(init_func))
+			return;
+	}
+
 	numa_init(dummy_numa_init);
 }
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [PATCH v5 4/4] arm64, dt, thunderx: Add initial dts for Cavium Thunder SoC in 2 Node topology.
  2015-08-14 16:39 ` Ganapatrao Kulkarni
@ 2015-08-14 16:39   ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-08-14 16:39 UTC (permalink / raw)
  To: linux-arm-kernel, devicetree, Will.Deacon, catalin.marinas,
	grant.likely, leif.lindholm, rfranz, ard.biesheuvel, msalter,
	robh+dt, steve.capper, hanjun.guo, al.stone, arnd, pawel.moll,
	mark.rutland, ijc+devicetree, galak
  Cc: gpkulkarni

Add a DT file for Cavium's Thunder SoC in a 2-node topology
using the arm,associativity device node property.

Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
---
 arch/arm64/boot/dts/cavium/Makefile             |   2 +-
 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts  |  78 +++
 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi | 790 ++++++++++++++++++++++++
 3 files changed, 869 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
 create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi

diff --git a/arch/arm64/boot/dts/cavium/Makefile b/arch/arm64/boot/dts/cavium/Makefile
index e34f89d..7fe7067 100644
--- a/arch/arm64/boot/dts/cavium/Makefile
+++ b/arch/arm64/boot/dts/cavium/Makefile
@@ -1,4 +1,4 @@
-dtb-$(CONFIG_ARCH_THUNDER) += thunder-88xx.dtb
+dtb-$(CONFIG_ARCH_THUNDER) += thunder-88xx.dtb thunder-88xx-2n.dtb
 
 always		:= $(dtb-y)
 subdir-y	:= $(dts-dirs)
diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts b/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
new file mode 100644
index 0000000..adbd3a9
--- /dev/null
+++ b/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
@@ -0,0 +1,78 @@
+/*
+ * Cavium Thunder DTS file - Thunder board description
+ *
+ * Copyright (C) 2014, Cavium Inc.
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This library is free software; you can redistribute it and/or
+ *     modify it under the terms of the GNU General Public License as
+ *     published by the Free Software Foundation; either version 2 of the
+ *     License, or (at your option) any later version.
+ *
+ *     This library is distributed in the hope that it will be useful,
+ *     but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *     GNU General Public License for more details.
+ *
+ *     You should have received a copy of the GNU General Public
+ *     License along with this library; if not, write to the Free
+ *     Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
+ *     MA 02110-1301 USA
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ *     obtaining a copy of this software and associated documentation
+ *     files (the "Software"), to deal in the Software without
+ *     restriction, including without limitation the rights to use,
+ *     copy, modify, merge, publish, distribute, sublicense, and/or
+ *     sell copies of the Software, and to permit persons to whom the
+ *     Software is furnished to do so, subject to the following
+ *     conditions:
+ *
+ *     The above copyright notice and this permission notice shall be
+ *     included in all copies or substantial portions of the Software.
+ *
+ *     THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ *     EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ *     OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ *     NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ *     HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ *     WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ *     FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ *     OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/dts-v1/;
+
+/include/ "thunder-88xx-2n.dtsi"
+
+/ {
+	model = "Cavium ThunderX CN88XX board";
+	compatible = "cavium,thunder-88xx";
+	arm,associativity-reference-points = <0>;
+
+	aliases {
+		serial0 = &uaa0;
+		serial1 = &uaa1;
+	};
+
+	memory@00000000 {
+		device_type = "memory";
+		reg = <0x0 0x00000000 0x0 0x80000000>;
+		/* socket 0, no specific cluster, core */
+		arm,associativity = <0 0xffff 0xffff>;
+	};
+
+	memory@10000000000 {
+		device_type = "memory";
+		reg = <0x100 0x00000000 0x0 0x80000000>;
+		/* socket 1, no specific cluster, core */
+		arm,associativity = <1 0xffff 0xffff>;
+	};
+
+};
diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi b/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
new file mode 100644
index 0000000..dc6f8ea
--- /dev/null
+++ b/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
@@ -0,0 +1,790 @@
+/*
+ * Cavium Thunder DTS file - Thunder SoC description
+ *
+ * Copyright (C) 2014, Cavium Inc.
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This library is free software; you can redistribute it and/or
+ *     modify it under the terms of the GNU General Public License as
+ *     published by the Free Software Foundation; either version 2 of the
+ *     License, or (at your option) any later version.
+ *
+ *     This library is distributed in the hope that it will be useful,
+ *     but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *     GNU General Public License for more details.
+ *
+ *     You should have received a copy of the GNU General Public
+ *     License along with this library; if not, write to the Free
+ *     Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
+ *     MA 02110-1301 USA
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ *     obtaining a copy of this software and associated documentation
+ *     files (the "Software"), to deal in the Software without
+ *     restriction, including without limitation the rights to use,
+ *     copy, modify, merge, publish, distribute, sublicense, and/or
+ *     sell copies of the Software, and to permit persons to whom the
+ *     Software is furnished to do so, subject to the following
+ *     conditions:
+ *
+ *     The above copyright notice and this permission notice shall be
+ *     included in all copies or substantial portions of the Software.
+ *
+ *     THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ *     EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ *     OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ *     NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ *     HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ *     WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ *     FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ *     OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/ {
+	compatible = "cavium,thunder-88xx";
+	interrupt-parent = <&gic0>;
+	#address-cells = <2>;
+	#size-cells = <2>;
+
+	psci {
+		compatible = "arm,psci-0.2";
+		method = "smc";
+	};
+
+	cpus {
+		#address-cells = <2>;
+		#size-cells = <0>;
+
+		cpu@000 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x000>;
+			enable-method = "psci";
+			/* socket 0, cluster 0, core 0 */
+			arm,associativity = <0 0 0>;
+		};
+		cpu@001 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x001>;
+			enable-method = "psci";
+			arm,associativity = <0 0 1>;
+		};
+		cpu@002 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x002>;
+			enable-method = "psci";
+			arm,associativity = <0 0 2>;
+		};
+		cpu@003 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x003>;
+			enable-method = "psci";
+			arm,associativity = <0 0 3>;
+		};
+		cpu@004 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x004>;
+			enable-method = "psci";
+			arm,associativity = <0 0 4>;
+		};
+		cpu@005 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x005>;
+			enable-method = "psci";
+			arm,associativity = <0 0 5>;
+		};
+		cpu@006 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x006>;
+			enable-method = "psci";
+			arm,associativity = <0 0 6>;
+		};
+		cpu@007 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x007>;
+			enable-method = "psci";
+			arm,associativity = <0 0 7>;
+		};
+		cpu@008 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x008>;
+			enable-method = "psci";
+			arm,associativity = <0 0 8>;
+		};
+		cpu@009 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x009>;
+			enable-method = "psci";
+			arm,associativity = <0 0 9>;
+		};
+		cpu@00a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00a>;
+			enable-method = "psci";
+			arm,associativity = <0 0 10>;
+		};
+		cpu@00b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00b>;
+			enable-method = "psci";
+			arm,associativity = <0 0 11>;
+		};
+		cpu@00c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00c>;
+			enable-method = "psci";
+			arm,associativity = <0 0 12>;
+		};
+		cpu@00d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00d>;
+			enable-method = "psci";
+			arm,associativity = <0 0 13>;
+		};
+		cpu@00e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00e>;
+			enable-method = "psci";
+			arm,associativity = <0 0 14>;
+		};
+		cpu@00f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00f>;
+			enable-method = "psci";
+			arm,associativity = <0 0 15>;
+		};
+		cpu@100 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x100>;
+			enable-method = "psci";
+			arm,associativity = <0 1 0>;
+		};
+		cpu@101 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x101>;
+			enable-method = "psci";
+			arm,associativity = <0 1 1>;
+		};
+		cpu@102 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x102>;
+			enable-method = "psci";
+			arm,associativity = <0 1 2>;
+		};
+		cpu@103 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x103>;
+			enable-method = "psci";
+			arm,associativity = <0 1 3>;
+		};
+		cpu@104 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x104>;
+			enable-method = "psci";
+			arm,associativity = <0 1 4>;
+		};
+		cpu@105 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x105>;
+			enable-method = "psci";
+			arm,associativity = <0 1 5>;
+		};
+		cpu@106 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x106>;
+			enable-method = "psci";
+			arm,associativity = <0 1 6>;
+		};
+		cpu@107 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x107>;
+			enable-method = "psci";
+			arm,associativity = <0 1 7>;
+		};
+		cpu@108 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x108>;
+			enable-method = "psci";
+			arm,associativity = <0 1 8>;
+		};
+		cpu@109 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x109>;
+			enable-method = "psci";
+			arm,associativity = <0 1 9>;
+		};
+		cpu@10a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10a>;
+			enable-method = "psci";
+			arm,associativity = <0 1 10>;
+		};
+		cpu@10b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10b>;
+			enable-method = "psci";
+			arm,associativity = <0 1 11>;
+		};
+		cpu@10c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10c>;
+			enable-method = "psci";
+			arm,associativity = <0 1 12>;
+		};
+		cpu@10d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10d>;
+			enable-method = "psci";
+			arm,associativity = <0 1 13>;
+		};
+		cpu@10e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10e>;
+			enable-method = "psci";
+			arm,associativity = <0 1 14>;
+		};
+		cpu@10f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10f>;
+			enable-method = "psci";
+			arm,associativity = <0 1 15>;
+		};
+		cpu@200 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x200>;
+			enable-method = "psci";
+			arm,associativity = <0 2 0>;
+		};
+		cpu@201 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x201>;
+			enable-method = "psci";
+			arm,associativity = <0 2 1>;
+		};
+		cpu@202 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x202>;
+			enable-method = "psci";
+			arm,associativity = <0 2 2>;
+		};
+		cpu@203 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x203>;
+			enable-method = "psci";
+			arm,associativity = <0 2 3>;
+		};
+		cpu@204 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x204>;
+			enable-method = "psci";
+			arm,associativity = <0 2 4>;
+		};
+		cpu@205 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x205>;
+			enable-method = "psci";
+			arm,associativity = <0 2 5>;
+		};
+		cpu@206 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x206>;
+			enable-method = "psci";
+			arm,associativity = <0 2 6>;
+		};
+		cpu@207 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x207>;
+			enable-method = "psci";
+			arm,associativity = <0 2 7>;
+		};
+		cpu@208 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x208>;
+			enable-method = "psci";
+			arm,associativity = <0 2 8>;
+		};
+		cpu@209 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x209>;
+			enable-method = "psci";
+			arm,associativity = <0 2 9>;
+		};
+		cpu@20a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20a>;
+			enable-method = "psci";
+			arm,associativity = <0 2 10>;
+		};
+		cpu@20b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20b>;
+			enable-method = "psci";
+			arm,associativity = <0 2 11>;
+		};
+		cpu@20c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20c>;
+			enable-method = "psci";
+			arm,associativity = <0 2 12>;
+		};
+		cpu@20d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20d>;
+			enable-method = "psci";
+			arm,associativity = <0 2 13>;
+		};
+		cpu@20e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20e>;
+			enable-method = "psci";
+			arm,associativity = <0 2 14>;
+		};
+		cpu@20f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20f>;
+			enable-method = "psci";
+			arm,associativity = <0 2 15>;
+		};
+		cpu@10000 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10000>;
+			enable-method = "psci";
+			/* socket 1, cluster 0, core 0 */
+			arm,associativity = <1 0 0>;
+		};
+		cpu@10001 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10001>;
+			enable-method = "psci";
+			arm,associativity = <1 0 1>;
+		};
+		cpu@10002 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10002>;
+			enable-method = "psci";
+			arm,associativity = <1 0 2>;
+		};
+		cpu@10003 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10003>;
+			enable-method = "psci";
+			arm,associativity = <1 0 3>;
+		};
+		cpu@10004 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10004>;
+			enable-method = "psci";
+			arm,associativity = <1 0 4>;
+		};
+		cpu@10005 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10005>;
+			enable-method = "psci";
+			arm,associativity = <1 0 5>;
+		};
+		cpu@10006 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10006>;
+			enable-method = "psci";
+			arm,associativity = <1 0 6>;
+		};
+		cpu@10007 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10007>;
+			enable-method = "psci";
+			arm,associativity = <1 0 7>;
+		};
+		cpu@10008 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10008>;
+			enable-method = "psci";
+			arm,associativity = <1 0 8>;
+		};
+		cpu@10009 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10009>;
+			enable-method = "psci";
+			arm,associativity = <1 0 9>;
+		};
+		cpu@1000a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000a>;
+			enable-method = "psci";
+			arm,associativity = <1 0 10>;
+		};
+		cpu@1000b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000b>;
+			enable-method = "psci";
+			arm,associativity = <1 0 11>;
+		};
+		cpu@1000c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000c>;
+			enable-method = "psci";
+			arm,associativity = <1 0 12>;
+		};
+		cpu@1000d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000d>;
+			enable-method = "psci";
+			arm,associativity = <1 0 13>;
+		};
+		cpu@1000e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000e>;
+			enable-method = "psci";
+			arm,associativity = <1 0 14>;
+		};
+		cpu@1000f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000f>;
+			enable-method = "psci";
+			arm,associativity = <1 0 15>;
+		};
+		cpu@10100 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10100>;
+			enable-method = "psci";
+			arm,associativity = <1 1 0>;
+		};
+		cpu@10101 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10101>;
+			enable-method = "psci";
+			arm,associativity = <1 1 1>;
+		};
+		cpu@10102 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10102>;
+			enable-method = "psci";
+			arm,associativity = <1 1 2>;
+		};
+		cpu@10103 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10103>;
+			enable-method = "psci";
+			arm,associativity = <1 1 3>;
+		};
+		cpu@10104 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10104>;
+			enable-method = "psci";
+			arm,associativity = <1 1 4>;
+		};
+		cpu@10105 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10105>;
+			enable-method = "psci";
+			arm,associativity = <1 1 5>;
+		};
+		cpu@10106 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10106>;
+			enable-method = "psci";
+			arm,associativity = <1 1 6>;
+		};
+		cpu@10107 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10107>;
+			enable-method = "psci";
+			arm,associativity = <1 1 7>;
+		};
+		cpu@10108 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10108>;
+			enable-method = "psci";
+			arm,associativity = <1 1 8>;
+		};
+		cpu@10109 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10109>;
+			enable-method = "psci";
+			arm,associativity = <1 1 9>;
+		};
+		cpu@1010a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010a>;
+			enable-method = "psci";
+			arm,associativity = <1 1 10>;
+		};
+		cpu@1010b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010b>;
+			enable-method = "psci";
+			arm,associativity = <1 1 11>;
+		};
+		cpu@1010c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010c>;
+			enable-method = "psci";
+			arm,associativity = <1 1 12>;
+		};
+		cpu@1010d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010d>;
+			enable-method = "psci";
+			arm,associativity = <1 1 13>;
+		};
+		cpu@1010e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010e>;
+			enable-method = "psci";
+			arm,associativity = <1 1 14>;
+		};
+		cpu@1010f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010f>;
+			enable-method = "psci";
+			arm,associativity = <1 1 15>;
+		};
+		cpu@10200 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10200>;
+			enable-method = "psci";
+			arm,associativity = <1 2 0>;
+		};
+		cpu@10201 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10201>;
+			enable-method = "psci";
+			arm,associativity = <1 2 1>;
+		};
+		cpu@10202 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10202>;
+			enable-method = "psci";
+			arm,associativity = <1 2 2>;
+		};
+		cpu@10203 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10203>;
+			enable-method = "psci";
+			arm,associativity = <1 2 3>;
+		};
+		cpu@10204 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10204>;
+			enable-method = "psci";
+			arm,associativity = <1 2 4>;
+		};
+		cpu@10205 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10205>;
+			enable-method = "psci";
+			arm,associativity = <1 2 5>;
+		};
+		cpu@10206 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10206>;
+			enable-method = "psci";
+			arm,associativity = <1 2 6>;
+		};
+		cpu@10207 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10207>;
+			enable-method = "psci";
+			arm,associativity = <1 2 7>;
+		};
+		cpu@10208 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10208>;
+			enable-method = "psci";
+			arm,associativity = <1 2 8>;
+		};
+		cpu@10209 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10209>;
+			enable-method = "psci";
+			arm,associativity = <1 2 9>;
+		};
+		cpu@1020a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020a>;
+			enable-method = "psci";
+			arm,associativity = <1 2 10>;
+		};
+		cpu@1020b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020b>;
+			enable-method = "psci";
+			arm,associativity = <1 2 11>;
+		};
+		cpu@1020c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020c>;
+			enable-method = "psci";
+			arm,associativity = <1 2 12>;
+		};
+		cpu@1020d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020d>;
+			enable-method = "psci";
+			arm,associativity = <1 2 13>;
+		};
+		cpu@1020e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020e>;
+			enable-method = "psci";
+			arm,associativity = <1 2 14>;
+		};
+		cpu@1020f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020f>;
+			enable-method = "psci";
+			arm,associativity = <1 2 15>;
+		};
+	};
+
+	timer {
+		compatible = "arm,armv8-timer";
+		interrupts = <1 13 0xff01>,
+		             <1 14 0xff01>,
+		             <1 11 0xff01>,
+		             <1 10 0xff01>;
+	};
+
+	soc {
+		compatible = "simple-bus";
+		#address-cells = <2>;
+		#size-cells = <2>;
+		ranges;
+		arm,associativity = <0 0xffff 0xffff>;
+
+		refclk50mhz: refclk50mhz {
+			compatible = "fixed-clock";
+			#clock-cells = <0>;
+			clock-frequency = <50000000>;
+			clock-output-names = "refclk50mhz";
+		};
+
+		gic0: interrupt-controller@8010,00000000 {
+			compatible = "arm,gic-v3";
+			#interrupt-cells = <3>;
+			#redistributor-regions = <2>;
+			interrupt-controller;
+			reg = <0x8010 0x00000000 0x0 0x010000>, /* GICD */
+			      <0x8010 0x80000000 0x0 0x600000>, /* GICR Node 0 */
+			      <0x9010 0x80000000 0x0 0x600000>; /* GICR Node 1 */
+			interrupts = <1 9 0xf04>;
+		};
+
+		uaa0: serial@87e0,24000000 {
+			compatible = "arm,pl011", "arm,primecell";
+			reg = <0x87e0 0x24000000 0x0 0x1000>;
+			interrupts = <1 21 4>;
+			clocks = <&refclk50mhz>;
+			clock-names = "apb_pclk";
+		};
+
+		uaa1: serial@87e0,25000000 {
+			compatible = "arm,pl011", "arm,primecell";
+			reg = <0x87e0 0x25000000 0x0 0x1000>;
+			interrupts = <1 22 4>;
+			clocks = <&refclk50mhz>;
+			clock-names = "apb_pclk";
+		};
+	};
+};
-- 
1.8.1.4
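
In the cpu nodes above, the low cell of each reg value lines up with the
arm,associativity triple: for example reg 0x10203 is paired with <1 2 3>.
The sketch below decodes that cell using the MPIDR_EL1 affinity field layout
(Aff0 in bits [7:0], Aff1 in [15:8], Aff2 in [23:16]). Treating Aff2 as the
socket is an observation about this particular dts, not a general rule, and
struct cpu_assoc and reg_to_assoc() are hypothetical names.

```c
#include <stdint.h>

/* Socket/cluster/core triple in the order used by arm,associativity here. */
struct cpu_assoc {
	uint32_t socket;
	uint32_t cluster;
	uint32_t core;
};

/* Split the low 24 bits of a cpu reg value into its three affinity fields. */
static struct cpu_assoc reg_to_assoc(uint64_t reg)
{
	struct cpu_assoc a;

	a.core = (uint32_t)(reg & 0xff);            /* Aff0 */
	a.cluster = (uint32_t)((reg >> 8) & 0xff);  /* Aff1 */
	a.socket = (uint32_t)((reg >> 16) & 0xff);  /* Aff2, socket in this dts */
	return a;
}
```

A decoder like this could be used to cross-check a hand-written dts for
mismatches between reg and arm,associativity.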


* [PATCH v5 4/4] arm64, dt, thunderx: Add initial dts for Cavium Thunder SoC in 2 Node topology.
@ 2015-08-14 16:39   ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-08-14 16:39 UTC (permalink / raw)
  To: linux-arm-kernel

Add a dts file for Cavium's Thunder SoC in a 2-node topology,
using the arm,associativity device node property.

Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
---
 arch/arm64/boot/dts/cavium/Makefile             |   2 +-
 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts  |  78 +++
 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi | 790 ++++++++++++++++++++++++
 3 files changed, 869 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
 create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi

diff --git a/arch/arm64/boot/dts/cavium/Makefile b/arch/arm64/boot/dts/cavium/Makefile
index e34f89d..7fe7067 100644
--- a/arch/arm64/boot/dts/cavium/Makefile
+++ b/arch/arm64/boot/dts/cavium/Makefile
@@ -1,4 +1,4 @@
-dtb-$(CONFIG_ARCH_THUNDER) += thunder-88xx.dtb
+dtb-$(CONFIG_ARCH_THUNDER) += thunder-88xx.dtb thunder-88xx-2n.dtb
 
 always		:= $(dtb-y)
 subdir-y	:= $(dts-dirs)
diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts b/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
new file mode 100644
index 0000000..adbd3a9
--- /dev/null
+++ b/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
@@ -0,0 +1,78 @@
+/*
+ * Cavium Thunder DTS file - Thunder board description
+ *
+ * Copyright (C) 2014, Cavium Inc.
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This library is free software; you can redistribute it and/or
+ *     modify it under the terms of the GNU General Public License as
+ *     published by the Free Software Foundation; either version 2 of the
+ *     License, or (at your option) any later version.
+ *
+ *     This library is distributed in the hope that it will be useful,
+ *     but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *     GNU General Public License for more details.
+ *
+ *     You should have received a copy of the GNU General Public
+ *     License along with this library; if not, write to the Free
+ *     Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
+ *     MA 02110-1301 USA
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ *     obtaining a copy of this software and associated documentation
+ *     files (the "Software"), to deal in the Software without
+ *     restriction, including without limitation the rights to use,
+ *     copy, modify, merge, publish, distribute, sublicense, and/or
+ *     sell copies of the Software, and to permit persons to whom the
+ *     Software is furnished to do so, subject to the following
+ *     conditions:
+ *
+ *     The above copyright notice and this permission notice shall be
+ *     included in all copies or substantial portions of the Software.
+ *
+ *     THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ *     EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ *     OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ *     NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ *     HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ *     WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ *     FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ *     OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/dts-v1/;
+
+/include/ "thunder-88xx-2n.dtsi"
+
+/ {
+	model = "Cavium ThunderX CN88XX board";
+	compatible = "cavium,thunder-88xx";
+	arm,associativity-reference-points = <0>;
+
+	aliases {
+		serial0 = &uaa0;
+		serial1 = &uaa1;
+	};
+
+	memory@00000000 {
+		device_type = "memory";
+		reg = <0x0 0x00000000 0x0 0x80000000>;
+		/* socket 0, no specific cluster, core */
+		arm,associativity = <0 0xffff 0xffff>;
+	};
+
+	memory@10000000000 {
+		device_type = "memory";
+		reg = <0x100 0x00000000 0x0 0x80000000>;
+		/* socket 1, no specific cluster, core */
+		arm,associativity = <1 0xffff 0xffff>;
+	};
+
+};
diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi b/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
new file mode 100644
index 0000000..dc6f8ea
--- /dev/null
+++ b/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
@@ -0,0 +1,790 @@
+/*
+ * Cavium Thunder DTS file - Thunder SoC description
+ *
+ * Copyright (C) 2014, Cavium Inc.
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This library is free software; you can redistribute it and/or
+ *     modify it under the terms of the GNU General Public License as
+ *     published by the Free Software Foundation; either version 2 of the
+ *     License, or (at your option) any later version.
+ *
+ *     This library is distributed in the hope that it will be useful,
+ *     but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *     GNU General Public License for more details.
+ *
+ *     You should have received a copy of the GNU General Public
+ *     License along with this library; if not, write to the Free
+ *     Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
+ *     MA 02110-1301 USA
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ *     obtaining a copy of this software and associated documentation
+ *     files (the "Software"), to deal in the Software without
+ *     restriction, including without limitation the rights to use,
+ *     copy, modify, merge, publish, distribute, sublicense, and/or
+ *     sell copies of the Software, and to permit persons to whom the
+ *     Software is furnished to do so, subject to the following
+ *     conditions:
+ *
+ *     The above copyright notice and this permission notice shall be
+ *     included in all copies or substantial portions of the Software.
+ *
+ *     THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ *     EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ *     OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ *     NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ *     HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ *     WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ *     FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ *     OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/ {
+	compatible = "cavium,thunder-88xx";
+	interrupt-parent = <&gic0>;
+	#address-cells = <2>;
+	#size-cells = <2>;
+
+	psci {
+		compatible = "arm,psci-0.2";
+		method = "smc";
+	};
+
+	cpus {
+		#address-cells = <2>;
+		#size-cells = <0>;
+
+		cpu@000 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x000>;
+			enable-method = "psci";
+			/* socket 0, cluster 0, core 0 */
+			arm,associativity = <0 0 0>;
+		};
+		cpu@001 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x001>;
+			enable-method = "psci";
+			arm,associativity = <0 0 1>;
+		};
+		cpu@002 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x002>;
+			enable-method = "psci";
+			arm,associativity = <0 0 2>;
+		};
+		cpu@003 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x003>;
+			enable-method = "psci";
+			arm,associativity = <0 0 3>;
+		};
+		cpu@004 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x004>;
+			enable-method = "psci";
+			arm,associativity = <0 0 4>;
+		};
+		cpu@005 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x005>;
+			enable-method = "psci";
+			arm,associativity = <0 0 5>;
+		};
+		cpu@006 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x006>;
+			enable-method = "psci";
+			arm,associativity = <0 0 6>;
+		};
+		cpu@007 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x007>;
+			enable-method = "psci";
+			arm,associativity = <0 0 7>;
+		};
+		cpu@008 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x008>;
+			enable-method = "psci";
+			arm,associativity = <0 0 8>;
+		};
+		cpu@009 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x009>;
+			enable-method = "psci";
+			arm,associativity = <0 0 9>;
+		};
+		cpu@00a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00a>;
+			enable-method = "psci";
+			arm,associativity = <0 0 10>;
+		};
+		cpu@00b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00b>;
+			enable-method = "psci";
+			arm,associativity = <0 0 11>;
+		};
+		cpu@00c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00c>;
+			enable-method = "psci";
+			arm,associativity = <0 0 12>;
+		};
+		cpu@00d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00d>;
+			enable-method = "psci";
+			arm,associativity = <0 0 13>;
+		};
+		cpu@00e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00e>;
+			enable-method = "psci";
+			arm,associativity = <0 0 14>;
+		};
+		cpu@00f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00f>;
+			enable-method = "psci";
+			arm,associativity = <0 0 15>;
+		};
+		cpu@100 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x100>;
+			enable-method = "psci";
+			arm,associativity = <0 1 0>;
+		};
+		cpu@101 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x101>;
+			enable-method = "psci";
+			arm,associativity = <0 1 1>;
+		};
+		cpu@102 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x102>;
+			enable-method = "psci";
+			arm,associativity = <0 1 2>;
+		};
+		cpu@103 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x103>;
+			enable-method = "psci";
+			arm,associativity = <0 1 3>;
+		};
+		cpu@104 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x104>;
+			enable-method = "psci";
+			arm,associativity = <0 1 4>;
+		};
+		cpu@105 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x105>;
+			enable-method = "psci";
+			arm,associativity = <0 1 5>;
+		};
+		cpu@106 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x106>;
+			enable-method = "psci";
+			arm,associativity = <0 1 6>;
+		};
+		cpu@107 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x107>;
+			enable-method = "psci";
+			arm,associativity = <0 1 7>;
+		};
+		cpu@108 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x108>;
+			enable-method = "psci";
+			arm,associativity = <0 1 8>;
+		};
+		cpu@109 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x109>;
+			enable-method = "psci";
+			arm,associativity = <0 1 9>;
+		};
+		cpu@10a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10a>;
+			enable-method = "psci";
+			arm,associativity = <0 1 10>;
+		};
+		cpu@10b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10b>;
+			enable-method = "psci";
+			arm,associativity = <0 1 11>;
+		};
+		cpu@10c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10c>;
+			enable-method = "psci";
+			arm,associativity = <0 1 12>;
+		};
+		cpu@10d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10d>;
+			enable-method = "psci";
+			arm,associativity = <0 1 13>;
+		};
+		cpu@10e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10e>;
+			enable-method = "psci";
+			arm,associativity = <0 1 14>;
+		};
+		cpu@10f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10f>;
+			enable-method = "psci";
+			arm,associativity = <0 1 15>;
+		};
+		cpu@200 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x200>;
+			enable-method = "psci";
+			arm,associativity = <0 2 0>;
+		};
+		cpu@201 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x201>;
+			enable-method = "psci";
+			arm,associativity = <0 2 1>;
+		};
+		cpu@202 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x202>;
+			enable-method = "psci";
+			arm,associativity = <0 2 2>;
+		};
+		cpu@203 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x203>;
+			enable-method = "psci";
+			arm,associativity = <0 2 3>;
+		};
+		cpu@204 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x204>;
+			enable-method = "psci";
+			arm,associativity = <0 2 4>;
+		};
+		cpu@205 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x205>;
+			enable-method = "psci";
+			arm,associativity = <0 2 5>;
+		};
+		cpu@206 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x206>;
+			enable-method = "psci";
+			arm,associativity = <0 2 6>;
+		};
+		cpu@207 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x207>;
+			enable-method = "psci";
+			arm,associativity = <0 2 7>;
+		};
+		cpu@208 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x208>;
+			enable-method = "psci";
+			arm,associativity = <0 2 8>;
+		};
+		cpu@209 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x209>;
+			enable-method = "psci";
+			arm,associativity = <0 2 9>;
+		};
+		cpu@20a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20a>;
+			enable-method = "psci";
+			arm,associativity = <0 2 10>;
+		};
+		cpu@20b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20b>;
+			enable-method = "psci";
+			arm,associativity = <0 2 11>;
+		};
+		cpu@20c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20c>;
+			enable-method = "psci";
+			arm,associativity = <0 2 12>;
+		};
+		cpu@20d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20d>;
+			enable-method = "psci";
+			arm,associativity = <0 2 13>;
+		};
+		cpu@20e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20e>;
+			enable-method = "psci";
+			arm,associativity = <0 2 14>;
+		};
+		cpu@20f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20f>;
+			enable-method = "psci";
+			arm,associativity = <0 2 15>;
+		};
+		cpu@10000 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10000>;
+			enable-method = "psci";
+			/* socket 1, cluster 0, core 0*/
+			arm,associativity = <1 0 0>;
+		};
+		cpu@10001 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10001>;
+			enable-method = "psci";
+			arm,associativity = <1 0 1>;
+		};
+		cpu@10002 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10002>;
+			enable-method = "psci";
+			arm,associativity = <1 0 2>;
+		};
+		cpu@10003 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10003>;
+			enable-method = "psci";
+			arm,associativity = <1 0 3>;
+		};
+		cpu@10004 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10004>;
+			enable-method = "psci";
+			arm,associativity = <1 0 4>;
+		};
+		cpu@10005 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10005>;
+			enable-method = "psci";
+			arm,associativity = <1 0 5>;
+		};
+		cpu@10006 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10006>;
+			enable-method = "psci";
+			arm,associativity = <1 0 6>;
+		};
+		cpu@10007 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10007>;
+			enable-method = "psci";
+			arm,associativity = <1 0 7>;
+		};
+		cpu@10008 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10008>;
+			enable-method = "psci";
+			arm,associativity = <1 0 8>;
+		};
+		cpu@10009 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10009>;
+			enable-method = "psci";
+			arm,associativity = <1 0 9>;
+		};
+		cpu@1000a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000a>;
+			enable-method = "psci";
+			arm,associativity = <1 0 10>;
+		};
+		cpu@1000b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000b>;
+			enable-method = "psci";
+			arm,associativity = <1 0 11>;
+		};
+		cpu@1000c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000c>;
+			enable-method = "psci";
+			arm,associativity = <1 0 12>;
+		};
+		cpu@1000d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000d>;
+			enable-method = "psci";
+			arm,associativity = <1 0 13>;
+		};
+		cpu@1000e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000e>;
+			enable-method = "psci";
+			arm,associativity = <1 0 14>;
+		};
+		cpu@1000f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000f>;
+			enable-method = "psci";
+			arm,associativity = <1 0 15>;
+		};
+		cpu@10100 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10100>;
+			enable-method = "psci";
+			arm,associativity = <1 1 0>;
+		};
+		cpu@10101 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10101>;
+			enable-method = "psci";
+			arm,associativity = <1 1 1>;
+		};
+		cpu@10102 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10102>;
+			enable-method = "psci";
+			arm,associativity = <1 1 2>;
+		};
+		cpu@10103 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10103>;
+			enable-method = "psci";
+			arm,associativity = <1 1 3>;
+		};
+		cpu@10104 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10104>;
+			enable-method = "psci";
+			arm,associativity = <1 1 4>;
+		};
+		cpu@10105 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10105>;
+			enable-method = "psci";
+			arm,associativity = <1 1 5>;
+		};
+		cpu@10106 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10106>;
+			enable-method = "psci";
+			arm,associativity = <1 1 6>;
+		};
+		cpu@10107 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10107>;
+			enable-method = "psci";
+			arm,associativity = <1 1 7>;
+		};
+		cpu@10108 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10108>;
+			enable-method = "psci";
+			arm,associativity = <1 1 8>;
+		};
+		cpu@10109 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10109>;
+			enable-method = "psci";
+			arm,associativity = <1 1 9>;
+		};
+		cpu@1010a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010a>;
+			enable-method = "psci";
+			arm,associativity = <1 1 10>;
+		};
+		cpu@1010b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010b>;
+			enable-method = "psci";
+			arm,associativity = <1 1 11>;
+		};
+		cpu@1010c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010c>;
+			enable-method = "psci";
+			arm,associativity = <1 1 12>;
+		};
+		cpu@1010d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010d>;
+			enable-method = "psci";
+			arm,associativity = <1 1 13>;
+		};
+		cpu@1010e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010e>;
+			enable-method = "psci";
+			arm,associativity = <1 1 14>;
+		};
+		cpu@1010f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010f>;
+			enable-method = "psci";
+			arm,associativity = <1 1 15>;
+		};
+		cpu@10200 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10200>;
+			enable-method = "psci";
+			arm,associativity = <1 2 0>;
+		};
+		cpu@10201 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10201>;
+			enable-method = "psci";
+			arm,associativity = <1 2 1>;
+		};
+		cpu@10202 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10202>;
+			enable-method = "psci";
+			arm,associativity = <1 2 2>;
+		};
+		cpu@10203 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10203>;
+			enable-method = "psci";
+			arm,associativity = <1 2 3>;
+		};
+		cpu@10204 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10204>;
+			enable-method = "psci";
+			arm,associativity = <1 2 4>;
+		};
+		cpu@10205 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10205>;
+			enable-method = "psci";
+			arm,associativity = <1 2 5>;
+		};
+		cpu@10206 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10206>;
+			enable-method = "psci";
+			arm,associativity = <1 2 6>;
+		};
+		cpu@10207 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10207>;
+			enable-method = "psci";
+			arm,associativity = <1 2 7>;
+		};
+		cpu@10208 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10208>;
+			enable-method = "psci";
+			arm,associativity = <1 2 8>;
+		};
+		cpu@10209 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10209>;
+			enable-method = "psci";
+			arm,associativity = <1 2 9>;
+		};
+		cpu@1020a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020a>;
+			enable-method = "psci";
+			arm,associativity = <1 2 10>;
+		};
+		cpu@1020b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020b>;
+			enable-method = "psci";
+			arm,associativity = <1 2 11>;
+		};
+		cpu@1020c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020c>;
+			enable-method = "psci";
+			arm,associativity = <1 2 12>;
+		};
+		cpu@1020d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020d>;
+			enable-method = "psci";
+			arm,associativity = <1 2 13>;
+		};
+		cpu@1020e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020e>;
+			enable-method = "psci";
+			arm,associativity = <1 2 14>;
+		};
+		cpu@1020f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020f>;
+			enable-method = "psci";
+			arm,associativity = <1 2 15>;
+		};
+	};
+
+	timer {
+		compatible = "arm,armv8-timer";
+		interrupts = <1 13 0xff01>,
+		             <1 14 0xff01>,
+		             <1 11 0xff01>,
+		             <1 10 0xff01>;
+	};
+
+	soc {
+		compatible = "simple-bus";
+		#address-cells = <2>;
+		#size-cells = <2>;
+		ranges;
+		arm,associativity = <0 0xffff 0xffff>;
+
+		refclk50mhz: refclk50mhz {
+			compatible = "fixed-clock";
+			#clock-cells = <0>;
+			clock-frequency = <50000000>;
+			clock-output-names = "refclk50mhz";
+		};
+
+		gic0: interrupt-controller@8010,00000000 {
+			compatible = "arm,gic-v3";
+			#interrupt-cells = <3>;
+			#redistributor-regions = <2>;
+			interrupt-controller;
+			reg = <0x8010 0x00000000 0x0 0x010000>, /* GICD */
+			      <0x8010 0x80000000 0x0 0x600000>, /* GICR Node 0 */
+			      <0x9010 0x80000000 0x0 0x600000>; /* GICR Node 1 */
+			interrupts = <1 9 0xf04>;
+		};
+
+		uaa0: serial@87e0,24000000 {
+			compatible = "arm,pl011", "arm,primecell";
+			reg = <0x87e0 0x24000000 0x0 0x1000>;
+			interrupts = <1 21 4>;
+			clocks = <&refclk50mhz>;
+			clock-names = "apb_pclk";
+		};
+
+		uaa1: serial@87e0,25000000 {
+			compatible = "arm,pl011", "arm,primecell";
+			reg = <0x87e0 0x25000000 0x0 0x1000>;
+			interrupts = <1 22 4>;
+			clocks = <&refclk50mhz>;
+			clock-names = "apb_pclk";
+		};
+	};
+};
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 0/8] arm64, numa: Add numa support for arm64 platforms.
  2015-08-14 16:39 ` Ganapatrao Kulkarni
@ 2015-08-14 16:44     ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-08-14 16:44 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	Grant Likely, Leif Lindholm, rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	Ard Biesheuvel, msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring,
	Steve Capper, Hanjun Guo, Al Stone, Arnd Bergmann, Pawel Moll,
	Mark Rutland, Ian Campbell, Kumar Gala

On Fri, Aug 14, 2015 at 10:09 PM, Ganapatrao Kulkarni
<gkulkarni@caviumnetworks.com> wrote:
> v5:
>         - created base version of numa.c which creates dummy numa without using dt
>           on single socket platforms. Then added patches for dt support.
>         - Incorporated review comments from Hanjun Guo.
>
> v4:
> done changes as per Arnd review comments.
>
> v3:
> Added changes to support numa on arm64 based platforms.
> Tested these patches on cavium's multinode(2 node topology) platform.
> In this patchset, defined and implemented dt bindings for numa mapping
> for core and memory using device node property arm,associativity.
>
> v2:
> Defined and implemented numa map for memory, cores to node and
> proximity distance matrix of nodes.
>
> v1:
> Initial patchset to support numa on arm64 platforms.
>
> Note:
>         1. This patchset is tested for numa with dt on
>            thunderx single socket and dual socket boards.
>         2. Numa DT booting needs the dt memory nodes, which are deleted in current efi-stub,
>         hence to try numa with dt, you need to rebase with ard's patchset.
>         http://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-uefi-early-fdt-handling
>
>
> Ganapatrao Kulkarni (4):
>   arm64, numa: adding numa support for arm64 platforms.
>   Documentation: arm64/arm: dt bindings for numa.
>   arm64, numa: adding numa support for arm64 platforms.
>   arm64, dt, thunderx: Add initial dts for Cavium Thunder SoC in 2 Node
>     topology.
>
>  Documentation/devicetree/bindings/arm/numa.txt  | 212 +++++++
>  arch/arm64/Kconfig                              |  32 +
>  arch/arm64/boot/dts/cavium/Makefile             |   2 +-
>  arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts  |  78 +++
>  arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi | 790 ++++++++++++++++++++++++
>  arch/arm64/include/asm/mmzone.h                 |  32 +
>  arch/arm64/include/asm/numa.h                   |  49 ++
>  arch/arm64/kernel/Makefile                      |   1 +
>  arch/arm64/kernel/dt_numa.c                     | 316 ++++++++++
>  arch/arm64/kernel/setup.c                       |   9 +
>  arch/arm64/kernel/smp.c                         |   3 +
>  arch/arm64/mm/Makefile                          |   1 +
>  arch/arm64/mm/init.c                            |  34 +-
>  arch/arm64/mm/numa.c                            | 563 +++++++++++++++++
>  14 files changed, 2115 insertions(+), 7 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>  create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
>  create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
>  create mode 100644 arch/arm64/include/asm/mmzone.h
>  create mode 100644 arch/arm64/include/asm/numa.h
>  create mode 100644 arch/arm64/kernel/dt_numa.c
>  create mode 100644 arch/arm64/mm/numa.c
>
> --
> 1.8.1.4
>
Sorry, a minor correction: this patchset has 4 patches.

PATCH v5 0/8 => PATCH v5 0/4

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 4/4] arm64, dt, thunderx: Add initial dts for Cavium Thunder SoC in 2 Node topology.
  2015-08-14 16:39   ` Ganapatrao Kulkarni
@ 2015-08-18  6:16       ` Jisheng Zhang
  -1 siblings, 0 replies; 94+ messages in thread
From: Jisheng Zhang @ 2015-08-18  6:16 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will.Deacon-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4,
	pawel.moll-5wv7dgnIgG8, mark.rutland-5wv7dgnIgG8,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

Dear Ganapatrao,

On Fri, 14 Aug 2015 22:09:34 +0530
Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com> wrote:

> adding dt file for Cavium's Thunder SoC in 2 Node topology
> using arm,associativity device node property.
> 
> Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
> ---
>  arch/arm64/boot/dts/cavium/Makefile             |   2 +-
>  arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts  |  78 +++
>  arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi | 790 ++++++++++++++++++++++++
>  3 files changed, 869 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
>  create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
> 
> diff --git a/arch/arm64/boot/dts/cavium/Makefile b/arch/arm64/boot/dts/cavium/Makefile
> index e34f89d..7fe7067 100644
> --- a/arch/arm64/boot/dts/cavium/Makefile
> +++ b/arch/arm64/boot/dts/cavium/Makefile
> @@ -1,4 +1,4 @@
> -dtb-$(CONFIG_ARCH_THUNDER) += thunder-88xx.dtb
> +dtb-$(CONFIG_ARCH_THUNDER) += thunder-88xx.dtb thunder-88xx-2n.dtb
>  
>  always		:= $(dtb-y)
>  subdir-y	:= $(dts-dirs)
> diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts b/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
> new file mode 100644
> index 0000000..adbd3a9
> --- /dev/null
> +++ b/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
> @@ -0,0 +1,78 @@
> +/*
> + * Cavium Thunder DTS file - Thunder board description
> + *
> + * Copyright (C) 2014, Cavium Inc.
> + *
> + * This file is dual-licensed: you can use it either under the terms
> + * of the GPL or the X11 license, at your option. Note that this dual
> + * licensing only applies to this file, and not this project as a
> + * whole.
> + *
> + *  a) This library is free software; you can redistribute it and/or
> + *     modify it under the terms of the GNU General Public License as
> + *     published by the Free Software Foundation; either version 2 of the
> + *     License, or (at your option) any later version.
> + *
> + *     This library is distributed in the hope that it will be useful,
> + *     but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *     GNU General Public License for more details.
> + *
> + *     You should have received a copy of the GNU General Public
> + *     License along with this library; if not, write to the Free
> + *     Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
> + *     MA 02110-1301 USA

From the kernel logs, it seems this FSF mailing address is no longer valid.
It would be better to remove this paragraph about writing to the FSF.

I also submitted a patch to remove this paragraph from existing thunder-88xx dts:
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/358225.html

Thanks,
Jisheng

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 4/4] arm64, dt, thunderx: Add initial dts for Cavium Thunder SoC in 2 Node topology.
@ 2015-08-18  6:16       ` Jisheng Zhang
  0 siblings, 0 replies; 94+ messages in thread
From: Jisheng Zhang @ 2015-08-18  6:16 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Ganapatrao,

On Fri, 14 Aug 2015 22:09:34 +0530
Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com> wrote:

> adding dt file for Cavium's Thunder SoC in 2 Node topology
> using arm,associativity device node property.
> 
> Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
> ---
>  arch/arm64/boot/dts/cavium/Makefile             |   2 +-
>  arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts  |  78 +++
>  arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi | 790 ++++++++++++++++++++++++
>  3 files changed, 869 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
>  create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
> 
> diff --git a/arch/arm64/boot/dts/cavium/Makefile b/arch/arm64/boot/dts/cavium/Makefile
> index e34f89d..7fe7067 100644
> --- a/arch/arm64/boot/dts/cavium/Makefile
> +++ b/arch/arm64/boot/dts/cavium/Makefile
> @@ -1,4 +1,4 @@
> -dtb-$(CONFIG_ARCH_THUNDER) += thunder-88xx.dtb
> +dtb-$(CONFIG_ARCH_THUNDER) += thunder-88xx.dtb thunder-88xx-2n.dtb
>  
>  always		:= $(dtb-y)
>  subdir-y	:= $(dts-dirs)
> diff --git a/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts b/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
> new file mode 100644
> index 0000000..adbd3a9
> --- /dev/null
> +++ b/arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
> @@ -0,0 +1,78 @@
> +/*
> + * Cavium Thunder DTS file - Thunder board description
> + *
> + * Copyright (C) 2014, Cavium Inc.
> + *
> + * This file is dual-licensed: you can use it either under the terms
> + * of the GPL or the X11 license, at your option. Note that this dual
> + * licensing only applies to this file, and not this project as a
> + * whole.
> + *
> + *  a) This library is free software; you can redistribute it and/or
> + *     modify it under the terms of the GNU General Public License as
> + *     published by the Free Software Foundation; either version 2 of the
> + *     License, or (at your option) any later version.
> + *
> + *     This library is distributed in the hope that it will be useful,
> + *     but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *     GNU General Public License for more details.
> + *
> + *     You should have received a copy of the GNU General Public
> + *     License along with this library; if not, write to the Free
> + *     Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
> + *     MA 02110-1301 USA

From the kernel logs, it seems this FSF mailing address is no longer valid.
It would be better to remove the paragraph about writing to the FSF.

I also submitted a patch to remove this paragraph from the existing thunder-88xx dts:
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/358225.html

Thanks,
Jisheng

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 0/8] arm64, numa: Add numa support for arm64 platforms.
  2015-08-14 16:44     ` Ganapatrao Kulkarni
@ 2015-08-20  6:50         ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-08-20  6:50 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	Grant Likely, Leif Lindholm, rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	Ard Biesheuvel, msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring,
	Steve Capper, Hanjun Guo, Al Stone, Arnd Bergmann, Pawel Moll,
	Mark Rutland, Ian Campbell, Kumar Gala, Prasun Kapoor

Hi Maintainers,


On Fri, Aug 14, 2015 at 10:14 PM, Ganapatrao Kulkarni
<gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Fri, Aug 14, 2015 at 10:09 PM, Ganapatrao Kulkarni
> <gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org> wrote:
>> v5:
>>         - created base verion of numa.c which creates dummy numa without using dt
>>           on single socket platforms. Then added patches for dt support.
>>         - Incorporated review comments from Hanjun Guo.
>>
>> v4:
>> done changes as per Arnd review comments.
>>
>> v3:
>> Added changes to support numa on arm64 based platforms.
>> Tested these patches on cavium's multinode(2 node topology) platform.
>> In this patchset, defined and implemented dt bindings for numa mapping
>> for core and memory using device node property arm,associativity.
>>
>> v2:
>> Defined and implemented numa map for memory, cores to node and
>> proximity distance matrix of nodes.
>>
>> v1:
>> Initial patchset to support numa on arm64 platforms.
>>
>> Note:
>>         1. This patchset is tested for numa with dt on
>>            thunderx single socket and dual socket boards.
>>         2. Numa DT booting needs the dt memory nodes, which are deleted in current efi-stub,
>>         hence to try numa with dt, you need to rebase with ard's patchset.
>>         http://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-uefi-early-fdt-handling
>>
>>
>> Ganapatrao Kulkarni (4):
>>   arm64, numa: adding numa support for arm64 platforms.
>>   Documentation: arm64/arm: dt bindings for numa.
>>   arm64, numa: adding numa support for arm64 platforms.
>>   arm64, dt, thunderx: Add initial dts for Cavium Thunder SoC in 2 Node
>>     topology.
>>
>>  Documentation/devicetree/bindings/arm/numa.txt  | 212 +++++++
>>  arch/arm64/Kconfig                              |  32 +
>>  arch/arm64/boot/dts/cavium/Makefile             |   2 +-
>>  arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts  |  78 +++
>>  arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi | 790 ++++++++++++++++++++++++
>>  arch/arm64/include/asm/mmzone.h                 |  32 +
>>  arch/arm64/include/asm/numa.h                   |  49 ++
>>  arch/arm64/kernel/Makefile                      |   1 +
>>  arch/arm64/kernel/dt_numa.c                     | 316 ++++++++++
>>  arch/arm64/kernel/setup.c                       |   9 +
>>  arch/arm64/kernel/smp.c                         |   3 +
>>  arch/arm64/mm/Makefile                          |   1 +
>>  arch/arm64/mm/init.c                            |  34 +-
>>  arch/arm64/mm/numa.c                            | 563 +++++++++++++++++
>>  14 files changed, 2115 insertions(+), 7 deletions(-)
>>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>>  create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
>>  create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
>>  create mode 100644 arch/arm64/include/asm/mmzone.h
>>  create mode 100644 arch/arm64/include/asm/numa.h
>>  create mode 100644 arch/arm64/kernel/dt_numa.c
>>  create mode 100644 arch/arm64/mm/numa.c
>>
>> --
>> 1.8.1.4
>>
> sorry, minor correction, this patchset has 4 patches.
>
> PATCH v5 0/8 => PATCH v5 0/4

please review this patchset.

thanks
Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-14 16:39   ` Ganapatrao Kulkarni
@ 2015-08-22 15:06     ` Robert Richter
  -1 siblings, 0 replies; 94+ messages in thread
From: Robert Richter @ 2015-08-22 15:06 UTC (permalink / raw)
  To: Arnd Bergmann, Rob Herring
  Cc: mark.rutland, devicetree, steve.capper, pawel.moll, al.stone,
	ard.biesheuvel, catalin.marinas, ijc+devicetree, Will.Deacon,
	leif.lindholm, rfranz, galak, hanjun.guo, msalter, grant.likely,
	Ganapatrao Kulkarni, linux-arm-kernel, gpkulkarni

On 14.08.15 22:09:32, Ganapatrao Kulkarni wrote:
> DT bindings for numa map for memory, cores and IOs using
> arm,associativity device node property.

Arnd, Rob,

as the change below suggests the same topology syntax as already
implemented for PPC, could you take a look at this one for arm64?
Please ack the devicetree changes, assuming you are fine with it.

All other review comments are addressed so far and there are no open
issues with the patches. This would help us to further drive this
series upstream.

Many thanks,

-Robert


> Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
> ---
>  Documentation/devicetree/bindings/arm/numa.txt | 212 +++++++++++++++++++++++++
>  1 file changed, 212 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
> 
> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
> new file mode 100644
> index 0000000..dc3ef86
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/numa.txt
> @@ -0,0 +1,212 @@
> +==============================================================================
> +NUMA binding description.
> +==============================================================================
> +
> +==============================================================================
> +1 - Introduction
> +==============================================================================
> +
> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
> +collections of hardware resources including processors, memory, and I/O buses,
> +that comprise what is commonly known as a NUMA node.
> +Processor accesses to memory within the local NUMA node are generally faster
> +than processor accesses to memory outside of the local NUMA node.
> +DT defines interfaces that allow the platform to convey NUMA node
> +topology information to the OS.
> +
> +==============================================================================
> +2 - arm,associativity
> +==============================================================================
> +The mapping is done using the arm,associativity device node property.
> +This property needs to be present in every device node which needs to be
> +mapped to NUMA nodes.
> +
> +The arm,associativity property is a set of 32-bit integers which define the
> +levels of topology and the boundaries in the system at which a significant
> +difference in performance can be measured between cross-device accesses
> +within a single location and those spanning multiple locations.
> +The first cell always contains the broadest subdivision within the system,
> +while the last cell enumerates the individual devices, such as an SMT thread
> +of a CPU, or a bus bridge within an SoC.
> +
> +ex:
> +	/* board 0, socket 0, cluster 0, core 0  thread 0 */
> +	arm,associativity = <0 0 0 0 0>;
> +
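
To make the cell ordering described above concrete, here is a minimal sketch
in C. It is not taken from this patch series; the helper name
assoc_first_diff is hypothetical. It only assumes what the text states: cell 0
is the broadest subdivision, so the first cell at which two lists differ marks
the topology boundary that separates the two devices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the index of the first cell at which two associativity lists
 * differ, or ncells if they are identical. Cell 0 is the broadest
 * subdivision, so a smaller return value means the two devices are
 * further apart in the topology.
 */
size_t assoc_first_diff(const uint32_t *a, const uint32_t *b, size_t ncells)
{
	size_t i;

	assert(a != NULL && b != NULL);

	for (i = 0; i < ncells; i++) {
		if (a[i] != b[i])
			break;
	}
	return i;
}
```

With three cells (board, socket, core), two cores on the same socket differ
only at index 2, while cores on different boards already differ at index 0.
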
> +==============================================================================
> +3 - arm,associativity-reference-points
> +==============================================================================
> +This property is a set of 32-bit integers, each representing an index into
> +the cells of the arm,associativity property. The first integer is the most
> +significant NUMA boundary and the following are progressively less
> +significant boundaries. There can be more than one level of NUMA.
> +
> +Ex:
> +	arm,associativity-reference-points = <0 1>;
> +	The board id (index 0) is used first to calculate the associativity
> +	(node distance), followed by the socket id (index 1).
> +
> +	arm,associativity-reference-points = <1 0>;
> +	The socket id (index 1) is used first to calculate the associativity,
> +	followed by the board id (index 0).
> +
> +	arm,associativity-reference-points = <0>;
> +	Only the board id (index 0) is used to calculate the associativity.
> +
> +	arm,associativity-reference-points = <1>;
> +	Only the socket id (index 1) is used to calculate the associativity.
> +
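
One way to read the examples above is that the first reference point selects
the associativity cell that identifies the NUMA node. The sketch below shows
that reading in C. It is an assumption for illustration only: the function
name assoc_to_node_id is hypothetical, and the parsing code in the series
(dt_numa.c) may well do this differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pick the NUMA node id for a device: the associativity cell selected by
 * the first (most significant) reference point. Returns -1 if there is no
 * reference point or if it indexes past the end of the associativity list.
 */
int assoc_to_node_id(const uint32_t *assoc, size_t ncells,
		     const uint32_t *ref_points, size_t nrefs)
{
	assert(assoc != NULL && ref_points != NULL);

	if (nrefs == 0 || ref_points[0] >= ncells)
		return -1;
	return (int)assoc[ref_points[0]];
}
```

Under this reading, a cpu with arm,associativity = <1 0 0> lands on node 1
with reference points <0> (board id) and on node 0 with reference points <1>
(socket id).
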
> +==============================================================================
> +4 - Example dts
> +==============================================================================
> +
> +Example: a 2-node system consisting of 2 boards, each board having one
> +socket and 8 cores in each socket.
> +
> +	arm,associativity-reference-points = <0>;
> +
> +	memory@00c00000 {
> +		device_type = "memory";
> +		reg = <0x0 0x00c00000 0x0 0x80000000>;
> +		/* board 0, socket 0, no specific core */
> +		arm,associativity = <0 0 0xffff>;
> +	};
> +
> +	memory@10000000000 {
> +		device_type = "memory";
> +		reg = <0x100 0x00000000 0x0 0x80000000>;
> +		/* board 1, socket 0, no specific core */
> +		arm,associativity = <1 0 0xffff>;
> +	};
> +
> +	cpus {
> +		#address-cells = <2>;
> +		#size-cells = <0>;
> +
> +		cpu@000 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x000>;
> +			enable-method = "psci";
> +			/* board 0, socket 0, core 0*/
> +			arm,associativity = <0 0 0>;
> +		};
> +		cpu@001 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x001>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 1>;
> +		};
> +		cpu@002 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x002>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 2>;
> +		};
> +		cpu@003 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x003>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 3>;
> +		};
> +		cpu@004 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x004>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 4>;
> +		};
> +		cpu@005 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x005>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 5>;
> +		};
> +		cpu@006 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x006>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 6>;
> +		};
> +		cpu@007 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x007>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 7>;
> +		};
> +		cpu@008 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x008>;
> +			enable-method = "psci";
> +			/* board 1, socket 0, core 0*/
> +			arm,associativity = <1 0 0>;
> +		};
> +		cpu@009 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x009>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 1>;
> +		};
> +		cpu@00a {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00a>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 2>;
> +		};
> +		cpu@00b {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00b>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 3>;
> +		};
> +		cpu@00c {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00c>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 4>;
> +		};
> +		cpu@00d {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00d>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 5>;
> +		};
> +		cpu@00e {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00e>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 6>;
> +		};
> +		cpu@00f {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00f>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 7>;
> +		};
> +	};
> +
> +	pcie0: pcie0@0x8480,00000000 {
> +		compatible = "arm,armv8";
> +		device_type = "pci";
> +		bus-range = <0 255>;
> +		#size-cells = <2>;
> +		#address-cells = <3>;
> +		reg = <0x8480 0x00000000 0 0x10000000>;  /* Configuration space */
> +		ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>; /* mem ranges */
> +		/* board 0, socket 0, pci bus 0*/
> +		arm,associativity = <0 0 0>;
> +        };
> -- 
> 1.8.1.4
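
Since the last cell enumerates the individual device, two cpu nodes would not
normally be expected to carry the same associativity list. A hand-written
table like the one in section 4 is easy to get wrong, so a small check can
help. The following is only a sketch in C under that assumption; the name
find_duplicate_assoc and the fixed three-cell layout are hypothetical and not
part of this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ASSOC_CELLS 3	/* board, socket, core, as in the section 4 example */

/*
 * Return the index of the first cpu whose associativity list repeats that
 * of an earlier cpu, or -1 if all entries are distinct.
 */
int find_duplicate_assoc(const uint32_t (*cpus)[ASSOC_CELLS], size_t ncpus)
{
	size_t i, j;

	assert(cpus != NULL);

	for (i = 1; i < ncpus; i++) {
		for (j = 0; j < i; j++) {
			if (memcmp(cpus[i], cpus[j], sizeof(cpus[i])) == 0)
				return (int)i;
		}
	}
	return -1;
}
```

The check is deliberately limited to nodes of the same type: in the example a
PCI host bridge and a cpu can both legitimately use <0 0 0>.
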

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-14 16:39   ` Ganapatrao Kulkarni
@ 2015-08-23 21:49       ` Rob Herring
  -1 siblings, 0 replies; 94+ messages in thread
From: Rob Herring @ 2015-08-23 21:49 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	Grant Likely, Leif Lindholm, rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	Ard Biesheuvel, Mark Salter, Rob Herring, Steve Capper,
	Hanjun Guo, Al Stone, Arnd Bergmann, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

On Fri, Aug 14, 2015 at 11:39 AM, Ganapatrao Kulkarni
<gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org> wrote:
> DT bindings for numa map for memory, cores and IOs using
> arm,associativity device node property.
>
> Signed-off-by: Ganapatrao Kulkarni <gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>

Given this matches PPC, looks fine to me.

Acked-by: Rob Herring <robh-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

> ---
>  Documentation/devicetree/bindings/arm/numa.txt | 212 +++++++++++++++++++++++++
>  1 file changed, 212 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>
> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
> new file mode 100644
> index 0000000..dc3ef86
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/numa.txt
> @@ -0,0 +1,212 @@
> +==============================================================================
> +NUMA binding description.
> +==============================================================================
> +
> +==============================================================================
> +1 - Introduction
> +==============================================================================
> +
> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
> +collections of hardware resources including processors, memory, and I/O buses,
> +that comprise what is commonly known as a NUMA node.
> +Processor accesses to memory within the local NUMA node are generally faster
> +than processor accesses to memory outside of the local NUMA node.
> +DT defines interfaces that allow the platform to convey NUMA node
> +topology information to the OS.
> +
> +==============================================================================
> +2 - arm,associativity
> +==============================================================================
> +The mapping is done using the arm,associativity device node property.
> +This property needs to be present in every device node which needs to be
> +mapped to NUMA nodes.
> +
> +The arm,associativity property is a set of 32-bit integers which define the
> +levels of topology and the boundaries in the system at which a significant
> +difference in performance can be measured between cross-device accesses
> +within a single location and those spanning multiple locations.
> +The first cell always contains the broadest subdivision within the system,
> +while the last cell enumerates the individual devices, such as an SMT thread
> +of a CPU, or a bus bridge within an SoC.
> +
> +ex:
> +       /* board 0, socket 0, cluster 0, core 0  thread 0 */
> +       arm,associativity = <0 0 0 0 0>;
> +
> +==============================================================================
> +3 - arm,associativity-reference-points
> +==============================================================================
> +This property is a set of 32-bit integers, each representing an index into
> +the cells of the arm,associativity property. The first integer is the most
> +significant NUMA boundary and the following are progressively less
> +significant boundaries. There can be more than one level of NUMA.
> +
> +Ex:
> +       arm,associativity-reference-points = <0 1>;
> +       The board id (index 0) is used first to calculate the associativity
> +       (node distance), followed by the socket id (index 1).
> +
> +       arm,associativity-reference-points = <1 0>;
> +       The socket id (index 1) is used first to calculate the associativity,
> +       followed by the board id (index 0).
> +
> +       arm,associativity-reference-points = <0>;
> +       Only the board id (index 0) is used to calculate the associativity.
> +
> +       arm,associativity-reference-points = <1>;
> +       Only the socket id (index 1) is used to calculate the associativity.
> +
> +==============================================================================
> +4 - Example dts
> +==============================================================================
> +
> +Example: a 2-node system consisting of 2 boards, each board having one
> +socket and 8 cores in each socket.
> +
> +       arm,associativity-reference-points = <0>;
> +
> +       memory@00c00000 {
> +               device_type = "memory";
> +               reg = <0x0 0x00c00000 0x0 0x80000000>;
> +               /* board 0, socket 0, no specific core */
> +               arm,associativity = <0 0 0xffff>;
> +       };
> +
> +       memory@10000000000 {
> +               device_type = "memory";
> +               reg = <0x100 0x00000000 0x0 0x80000000>;
> +               /* board 1, socket 0, no specific core */
> +               arm,associativity = <1 0 0xffff>;
> +       };
> +
> +       cpus {
> +               #address-cells = <2>;
> +               #size-cells = <0>;
> +
> +               cpu@000 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x000>;
> +                       enable-method = "psci";
> +                       /* board 0, socket 0, core 0*/
> +                       arm,associativity = <0 0 0>;
> +               };
> +               cpu@001 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x001>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 1>;
> +               };
> +               cpu@002 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x002>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 2>;
> +               };
> +               cpu@003 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x003>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 3>;
> +               };
> +               cpu@004 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x004>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 4>;
> +               };
> +               cpu@005 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x005>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 5>;
> +               };
> +               cpu@006 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x006>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 6>;
> +               };
> +               cpu@007 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x007>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 7>;
> +               };
> +               cpu@008 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x008>;
> +                       enable-method = "psci";
> +                       /* board 1, socket 0, core 0*/
> +                       arm,associativity = <1 0 0>;
> +               };
> +               cpu@009 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x009>;
> +                       enable-method = "psci";
> +                       arm,associativity = <1 0 1>;
> +               };
> +               cpu@00a {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x00a>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 2>;
> +               };
> +               cpu@00b {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x00b>;
> +                       enable-method = "psci";
> +                       arm,associativity = <1 0 3>;
> +               };
> +               cpu@00c {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x00c>;
> +                       enable-method = "psci";
> +                       arm,associativity = <1 0 4>;
> +               };
> +               cpu@00d {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x00d>;
> +                       enable-method = "psci";
> +                       arm,associativity = <1 0 5>;
> +               };
> +               cpu@00e {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x00e>;
> +                       enable-method = "psci";
> +                       arm,associativity = <1 0 6>;
> +               };
> +               cpu@00f {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x00f>;
> +                       enable-method = "psci";
> +                       arm,associativity = <1 0 7>;
> +               };
> +       };
> +
> +       pcie0: pcie0@0x8480,00000000 {
> +               compatible = "arm,armv8";
> +               device_type = "pci";
> +               bus-range = <0 255>;
> +               #size-cells = <2>;
> +               #address-cells = <3>;
> +               reg = <0x8480 0x00000000 0 0x10000000>;  /* Configuration space */
> +               ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>; /* mem ranges */
> +               /* board 0, socket 0, pci bus 0*/
> +               arm,associativity = <0 0 0>;
> +        };
> --
> 1.8.1.4
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-08-23 21:49       ` Rob Herring
  0 siblings, 0 replies; 94+ messages in thread
From: Rob Herring @ 2015-08-23 21:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 14, 2015 at 11:39 AM, Ganapatrao Kulkarni
<gkulkarni@caviumnetworks.com> wrote:
> DT bindings for numa map for memory, cores and IOs using
> arm,associativity device node property.
>
> Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>

Given this matches PPC, looks fine to me.

Acked-by: Rob Herring <robh@kernel.org>

> ---
>  Documentation/devicetree/bindings/arm/numa.txt | 212 +++++++++++++++++++++++++
>  1 file changed, 212 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>
> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
> new file mode 100644
> index 0000000..dc3ef86
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/numa.txt
> @@ -0,0 +1,212 @@
> +==============================================================================
> +NUMA binding description.
> +==============================================================================
> +
> +==============================================================================
> +1 - Introduction
> +==============================================================================
> +
> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
> +collections of hardware resources including processors, memory, and I/O buses,
> +that comprise what is commonly known as a NUMA node.
> +Processor accesses to memory within the local NUMA node is generally faster
> +than processor accesses to memory outside of the local NUMA node.
> +DT defines interfaces that allow the platform to convey NUMA node
> +topology information to OS.
> +
> +==============================================================================
> +2 - arm,associativity
> +==============================================================================
> +The mapping is done using the arm,associativity device property.
> +This property needs to be present in every device node which needs to be
> +mapped to NUMA nodes.
> +
> +The arm,associativity property is a set of 32-bit integers which defines the
> +levels of topology and the boundaries in the system at which a significant
> +difference in performance can be measured between cross-device accesses within
> +a single location and those spanning multiple locations.
> +The first cell always contains the broadest subdivision within the system,
> +while the last cell enumerates the individual devices, such as an SMT thread
> +of a CPU, or a bus bridge within an SoC.
> +
> +ex:
> +       /* board 0, socket 0, cluster 0, core 0  thread 0 */
> +       arm,associativity = <0 0 0 0 0>;
> +
> +==============================================================================
> +3 - arm,associativity-reference-points
> +==============================================================================
> +This property is a set of 32-bit integers, each representing an index into
> +the arm,associativity cells. The first integer is the most significant
> +NUMA boundary and the following are progressively less significant boundaries.
> +There can be more than one level of NUMA.
> +
> +Ex:
> +       arm,associativity-reference-points = <0 1>;
> +       The board id (index 0) is used first to calculate the associativity
> +       (node distance), followed by the socket id (index 1).
> +
> +       arm,associativity-reference-points = <1 0>;
> +       The socket id (index 1) is used first to calculate the associativity,
> +       followed by the board id (index 0).
> +
> +       arm,associativity-reference-points = <0>;
> +       Only the board id (index 0) is used to calculate the associativity.
> +
> +       arm,associativity-reference-points = <1>;
> +       Only the socket id (index 1) is used to calculate the associativity.
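
A minimal sketch of how the reference points might select the NUMA boundary
from an associativity list. This is not the kernel's implementation: the
helper names numa_key and assign_node_ids are hypothetical, and it assumes
that devices sharing the same values at the referenced indices belong to the
same node, with node ids handed out in first-seen order.

```python
def numa_key(associativity, reference_points):
    """Pick the cells named by the reference points, most significant first."""
    return tuple(associativity[i] for i in reference_points)


def assign_node_ids(devices, reference_points):
    """Map each device name to a node id (hypothetical first-seen numbering)."""
    node_ids = {}
    result = {}
    for name, associativity in devices.items():
        key = numa_key(associativity, reference_points)
        # A key seen for the first time gets the next free node id.
        node_ids.setdefault(key, len(node_ids))
        result[name] = node_ids[key]
    return result


devices = {
    "cpu@000": [0, 0, 0],               # board 0, socket 0, core 0
    "cpu@008": [1, 0, 0],               # board 1, socket 0, core 0
    "memory@00c00000": [0, 0, 0xffff],  # board 0, socket 0, no specific core
}
print(assign_node_ids(devices, [0]))    # board id only
print(assign_node_ids(devices, [1]))    # socket id only
```

With reference points <0> the two boards end up in different nodes; with <1>
every device shares socket 0, so everything collapses into a single node.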
> +
> +==============================================================================
> +4 - Example dts
> +==============================================================================
> +
> +Example: a 2-node system consisting of 2 boards, each board having one socket
> +and 8 cores in each socket.
> +
> +       arm,associativity-reference-points = <0>;
> +
> +       memory@00c00000 {
> +               device_type = "memory";
> +               reg = <0x0 0x00c00000 0x0 0x80000000>;
> +               /* board 0, socket 0, no specific core */
> +               arm,associativity = <0 0 0xffff>;
> +       };
> +
> +       memory@10000000000 {
> +               device_type = "memory";
> +               reg = <0x100 0x00000000 0x0 0x80000000>;
> +               /* board 1, socket 0, no specific core */
> +               arm,associativity = <1 0 0xffff>;
> +       };
> +
> +       cpus {
> +               #address-cells = <2>;
> +               #size-cells = <0>;
> +
> +               cpu@000 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x000>;
> +                       enable-method = "psci";
> +                       /* board 0, socket 0, core 0*/
> +                       arm,associativity = <0 0 0>;
> +               };
> +               cpu@001 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x001>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 1>;
> +               };
> +               cpu@002 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x002>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 2>;
> +               };
> +               cpu@003 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x003>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 3>;
> +               };
> +               cpu@004 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x004>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 4>;
> +               };
> +               cpu@005 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x005>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 5>;
> +               };
> +               cpu@006 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x006>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 6>;
> +               };
> +               cpu@007 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x007>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 7>;
> +               };
> +               cpu@008 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x008>;
> +                       enable-method = "psci";
> +                       /* board 1, socket 0, core 0*/
> +                       arm,associativity = <1 0 0>;
> +               };
> +               cpu@009 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x009>;
> +                       enable-method = "psci";
> +                       arm,associativity = <1 0 1>;
> +               };
> +               cpu@00a {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x00a>;
> +                       enable-method = "psci";
> +                       arm,associativity = <0 0 2>;
> +               };
> +               cpu@00b {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x00b>;
> +                       enable-method = "psci";
> +                       arm,associativity = <1 0 3>;
> +               };
> +               cpu@00c {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x00c>;
> +                       enable-method = "psci";
> +                       arm,associativity = <1 0 4>;
> +               };
> +               cpu@00d {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x00d>;
> +                       enable-method = "psci";
> +                       arm,associativity = <1 0 5>;
> +               };
> +               cpu@00e {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x00e>;
> +                       enable-method = "psci";
> +                       arm,associativity = <1 0 6>;
> +               };
> +               cpu@00f {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x00f>;
> +                       enable-method = "psci";
> +                       arm,associativity = <1 0 7>;
> +               };
> +       };
> +
> +       pcie0: pcie0@0x8480,00000000 {
> +               compatible = "arm,armv8";
> +               device_type = "pci";
> +               bus-range = <0 255>;
> +               #size-cells = <2>;
> +               #address-cells = <3>;
> +               reg = <0x8480 0x00000000 0 0x10000000>;  /* Configuration space */
> +               ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>; /* mem ranges */
> +               /* board 0, socket 0, pci bus 0*/
> +               arm,associativity = <0 0 0>;
> +        };
> --
> 1.8.1.4
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-14 16:39   ` Ganapatrao Kulkarni
@ 2015-08-28 11:32       ` Matthias Brugger
  -1 siblings, 0 replies; 94+ messages in thread
From: Matthias Brugger @ 2015-08-28 11:32 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, linux-arm-kernel, devicetree, Will.Deacon,
	catalin.marinas, grant.likely, leif.lindholm, rfranz,
	ard.biesheuvel, msalter, robh+dt, steve.capper, hanjun.guo,
	al.stone, arnd, pawel.moll, mark.rutland, ijc+devicetree, galak
  Cc: gpkulkarni

On 14/08/15 18:39, Ganapatrao Kulkarni wrote:
> DT bindings for numa map for memory, cores and IOs using
> arm,associativity device node property.
>
> Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
> ---
>   Documentation/devicetree/bindings/arm/numa.txt | 212 +++++++++++++++++++++++++
>   1 file changed, 212 insertions(+)
>   create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>
> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
> new file mode 100644
> index 0000000..dc3ef86
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/numa.txt
> @@ -0,0 +1,212 @@
> +==============================================================================
> +NUMA binding description.
> +==============================================================================
> +
> +==============================================================================
> +1 - Introduction
> +==============================================================================
> +
> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
> +collections of hardware resources including processors, memory, and I/O buses,
> +that comprise what is commonly known as a NUMA node.
> +Processor accesses to memory within the local NUMA node is generally faster
> +than processor accesses to memory outside of the local NUMA node.
> +DT defines interfaces that allow the platform to convey NUMA node
> +topology information to OS.
> +
> +==============================================================================
> +2 - arm,associativity
> +==============================================================================
> +The mapping is done using the arm,associativity device property.
> +This property needs to be present in every device node which needs to be
> +mapped to NUMA nodes.
> +
> +The arm,associativity property is a set of 32-bit integers which defines the
> +levels of topology and the boundaries in the system at which a significant
> +difference in performance can be measured between cross-device accesses within
> +a single location and those spanning multiple locations.
> +The first cell always contains the broadest subdivision within the system,
> +while the last cell enumerates the individual devices, such as an SMT thread
> +of a CPU, or a bus bridge within an SoC.
> +
> +ex:
> +	/* board 0, socket 0, cluster 0, core 0  thread 0 */
> +	arm,associativity = <0 0 0 0 0>;
> +
> +==============================================================================
> +3 - arm,associativity-reference-points
> +==============================================================================
> +This property is a set of 32-bit integers, each representing an index into
> +the arm,associativity cells. The first integer is the most significant
> +NUMA boundary and the following are progressively less significant boundaries.
> +There can be more than one level of NUMA.
> +
> +Ex:
> +	arm,associativity-reference-points = <0 1>;
> +	The board id (index 0) is used first to calculate the associativity
> +	(node distance), followed by the socket id (index 1).
> +
> +	arm,associativity-reference-points = <1 0>;
> +	The socket id (index 1) is used first to calculate the associativity,
> +	followed by the board id (index 0).
> +
> +	arm,associativity-reference-points = <0>;
> +	Only the board id (index 0) is used to calculate the associativity.
> +
> +	arm,associativity-reference-points = <1>;
> +	Only the socket id (index 1) is used to calculate the associativity.
> +
> +==============================================================================
> +4 - Example dts
> +==============================================================================
> +
> +Example: a 2-node system consisting of 2 boards, each board having one socket
> +and 8 cores in each socket.
> +
> +	arm,associativity-reference-points = <0>;
> +
> +	memory@00c00000 {
> +		device_type = "memory";
> +		reg = <0x0 0x00c00000 0x0 0x80000000>;
> +		/* board 0, socket 0, no specific core */
> +		arm,associativity = <0 0 0xffff>;
> +	};
> +
> +	memory@10000000000 {
> +		device_type = "memory";
> +		reg = <0x100 0x00000000 0x0 0x80000000>;
> +		/* board 1, socket 0, no specific core */
> +		arm,associativity = <1 0 0xffff>;
> +	};
> +
> +	cpus {
> +		#address-cells = <2>;
> +		#size-cells = <0>;
> +
> +		cpu@000 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x000>;
> +			enable-method = "psci";
> +			/* board 0, socket 0, core 0*/
> +			arm,associativity = <0 0 0>;
> +		};
> +		cpu@001 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x001>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 1>;
> +		};
> +		cpu@002 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x002>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 2>;
> +		};
> +		cpu@003 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x003>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 3>;
> +		};
> +		cpu@004 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x004>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 4>;
> +		};
> +		cpu@005 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x005>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 5>;
> +		};
> +		cpu@006 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x006>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 6>;
> +		};
> +		cpu@007 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x007>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 7>;
> +		};
> +		cpu@008 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x008>;
> +			enable-method = "psci";
> +			/* board 1, socket 0, core 0*/
> +			arm,associativity = <1 0 0>;
> +		};
> +		cpu@009 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x009>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 1>;
> +		};
> +		cpu@00a {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00a>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 2>;

Nit: this should be
arm,associativity = <1 0 2>;
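
A slip like this could be caught mechanically by comparing the first cell of
each cpu node against the board implied by its index. A minimal sketch,
assuming 8 cores per board as in the example above; expected_board and
find_mismatches are hypothetical helper names, not part of any kernel tool.

```python
CORES_PER_BOARD = 8  # assumption from the example: one socket per board, 8 cores each


def expected_board(cpu_index):
    """Board id implied by the cpu index under the assumption above."""
    return cpu_index // CORES_PER_BOARD


def find_mismatches(cpus):
    """Return (cpu_index, associativity) pairs whose board cell looks wrong."""
    return [(idx, assoc) for idx, assoc in sorted(cpus.items())
            if assoc[0] != expected_board(idx)]


cpus = {
    0x009: [1, 0, 1],
    0x00a: [0, 0, 2],   # as posted in the patch
    0x00b: [1, 0, 3],
}
print(find_mismatches(cpus))
```

Only the cpu@00a entry is reported, since its board cell is 0 while its index
places it on board 1.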

> +		};
> +		cpu@00b {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00b>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 3>;
> +		};
> +		cpu@00c {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00c>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 4>;
> +		};
> +		cpu@00d {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00d>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 5>;
> +		};
> +		cpu@00e {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00e>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 6>;
> +		};
> +		cpu@00f {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00f>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 7>;
> +		};
> +	};
> +
> +	pcie0: pcie0@0x8480,00000000 {
> +		compatible = "arm,armv8";
> +		device_type = "pci";
> +		bus-range = <0 255>;
> +		#size-cells = <2>;
> +		#address-cells = <3>;
> +		reg = <0x8480 0x00000000 0 0x10000000>;  /* Configuration space */
> +		ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>; /* mem ranges */
> +		/* board 0, socket 0, pci bus 0*/
> +		arm,associativity = <0 0 0>;
> +        };
>


* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-08-28 11:32       ` Matthias Brugger
  0 siblings, 0 replies; 94+ messages in thread
From: Matthias Brugger @ 2015-08-28 11:32 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/08/15 18:39, Ganapatrao Kulkarni wrote:
> DT bindings for numa map for memory, cores and IOs using
> arm,associativity device node property.
>
> Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
> ---
>   Documentation/devicetree/bindings/arm/numa.txt | 212 +++++++++++++++++++++++++
>   1 file changed, 212 insertions(+)
>   create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>
> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
> new file mode 100644
> index 0000000..dc3ef86
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/numa.txt
> @@ -0,0 +1,212 @@
> +==============================================================================
> +NUMA binding description.
> +==============================================================================
> +
> +==============================================================================
> +1 - Introduction
> +==============================================================================
> +
> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
> +collections of hardware resources including processors, memory, and I/O buses,
> +that comprise what is commonly known as a NUMA node.
> +Processor accesses to memory within the local NUMA node is generally faster
> +than processor accesses to memory outside of the local NUMA node.
> +DT defines interfaces that allow the platform to convey NUMA node
> +topology information to OS.
> +
> +==============================================================================
> +2 - arm,associativity
> +==============================================================================
> +The mapping is done using arm,associativity device property.
> +this property needs to be present in every device node which needs to to be
> +mapped to numa nodes.
> +
> +arm,associativity property is set of 32-bit integers which defines level of
> +topology and boundary in the system at which a significant difference in
> +performance can be measured between cross-device accesses within
> +a single location and those spanning multiple locations.
> +The first cell always contains the broadest subdivision within the system,
> +while the last cell enumerates the individual devices, such as an SMT thread
> +of a CPU, or a bus bridge within an SoC".
> +
> +ex:
> +	/* board 0, socket 0, cluster 0, core 0  thread 0 */
> +	arm,associativity = <0 0 0 0 0>;
> +
> +==============================================================================
> +3 - arm,associativity-reference-points
> +==============================================================================
> +This property is a set of 32-bit integers, each representing an index into
> +the arm,associativity nodes. The first integer is the most significant
> +NUMA boundary and the following are progressively less significant boundaries.
> +There can be more than one level of NUMA.
> +
> +Ex:
> +	arm,associativity-reference-points = <0 1>;
> +	The board Id(index 0) used first to calculate the associativity (node
> +	distance), then follows the  socket id(index 1).
> +
> +	arm,associativity-reference-points = <1 0>;
> +	The socket Id(index 1) used first to calculate the associativity,
> +	then follows the board id(index 0).
> +
> +	arm,associativity-reference-points = <0>;
> +	Only the board Id(index 0) used to calculate the associativity.
> +
> +	arm,associativity-reference-points = <1>;
> +	Only socket Id(index 1) used to calculate the associativity.
> +
> +==============================================================================
> +4 - Example dts
> +==============================================================================
> +
> +Example: 2 Node system consists of 2 boards and each board having one socket
> +and 8 core in each socket.
> +
> +	arm,associativity-reference-points = <0>;
> +
> +	memory at 00c00000 {
> +		device_type = "memory";
> +		reg = <0x0 0x00c00000 0x0 0x80000000>;
> +		/* board 0, socket 0, no specific core */
> +		arm,associativity = <0 0 0xffff>;
> +	};
> +
> +	memory at 10000000000 {
> +		device_type = "memory";
> +		reg = <0x100 0x00000000 0x0 0x80000000>;
> +		/* board 1, socket 0, no specific core */
> +		arm,associativity = <1 0 0xffff>;
> +	};
> +
> +	cpus {
> +		#address-cells = <2>;
> +		#size-cells = <0>;
> +
> +		cpu at 000 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x000>;
> +			enable-method = "psci";
> +			/* board 0, socket 0, core 0*/
> +			arm,associativity = <0 0 0>;
> +		};
> +		cpu at 001 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x001>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 1>;
> +		};
> +		cpu at 002 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x002>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 2>;
> +		};
> +		cpu at 003 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x003>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 3>;
> +		};
> +		cpu at 004 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x004>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 4>;
> +		};
> +		cpu at 005 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x005>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 5>;
> +		};
> +		cpu at 006 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x006>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 6>;
> +		};
> +		cpu at 007 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x007>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 7>;
> +		};
> +		cpu at 008 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x008>;
> +			enable-method = "psci";
> +			/* board 1, socket 0, core 0*/
> +			arm,associativity = <1 0 0>;
> +		};
> +		cpu at 009 {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x009>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 1>;
> +		};
> +		cpu at 00a {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00a>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 2>;

Nit: this should be
arm,associativity = <1 0 2>;

> +		};
> +		cpu at 00b {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00b>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 3>;
> +		};
> +		cpu at 00c {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00c>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 4>;
> +		};
> +		cpu@00d {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00d>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 5>;
> +		};
> +		cpu@00e {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00e>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 6>;
> +		};
> +		cpu@00f {
> +			device_type = "cpu";
> +			compatible =  "arm,armv8";
> +			reg = <0x0 0x00f>;
> +			enable-method = "psci";
> +			arm,associativity = <1 0 7>;
> +		};
> +	};
> +
> +	pcie0: pcie0@0x8480,00000000 {
> +		compatible = "arm,armv8";
> +		device_type = "pci";
> +		bus-range = <0 255>;
> +		#size-cells = <2>;
> +		#address-cells = <3>;
> +		reg = <0x8480 0x00000000 0 0x10000000>;  /* Configuration space */
> +		ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>; /* mem ranges */
> +		/* board 0, socket 0, pci bus 0*/
> +		arm,associativity = <0 0 0>;
> +        };
>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-14 16:39   ` Ganapatrao Kulkarni
@ 2015-08-28 12:32       ` Mark Rutland
  -1 siblings, 0 replies; 94+ messages in thread
From: Mark Rutland @ 2015-08-28 12:32 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4, Pawel Moll,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

Hi,

On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
> DT bindings for numa map for memory, cores and IOs using
> arm,associativity device node property.

Given this is just a copy of ibm,associativity, I'm not sure I see much
point in renaming the properties.

However, (somewhat counter to that) I'm also concerned that this isn't
sufficient for systems we're beginning to see today (more on that
below), so I don't think a simple copy of ibm,associativity is good
enough.

> 
> Signed-off-by: Ganapatrao Kulkarni <gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
> ---
>  Documentation/devicetree/bindings/arm/numa.txt | 212 +++++++++++++++++++++++++
>  1 file changed, 212 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
> 
> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
> new file mode 100644
> index 0000000..dc3ef86
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/numa.txt
> @@ -0,0 +1,212 @@
> +==============================================================================
> +NUMA binding description.
> +==============================================================================
> +
> +==============================================================================
> +1 - Introduction
> +==============================================================================
> +
> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
> +collections of hardware resources including processors, memory, and I/O buses,
> +that comprise what is commonly known as a NUMA node.
> +Processor accesses to memory within the local NUMA node is generally faster
> +than processor accesses to memory outside of the local NUMA node.
> +DT defines interfaces that allow the platform to convey NUMA node
> +topology information to OS.
> +
> +==============================================================================
> +2 - arm,associativity
> +==============================================================================
> +The mapping is done using arm,associativity device property.
> +this property needs to be present in every device node which needs to to be
> +mapped to numa nodes.

Can't there be some inheritance? e.g. all devices on a bus with an
arm,associativity property being assumed to share that value?

> +
> +arm,associativity property is set of 32-bit integers which defines level of

s/set/list/ -- the order is important.

> +topology and boundary in the system at which a significant difference in
> +performance can be measured between cross-device accesses within
> +a single location and those spanning multiple locations.
> +The first cell always contains the broadest subdivision within the system,
> +while the last cell enumerates the individual devices, such as an SMT thread
> +of a CPU, or a bus bridge within an SoC".

While this gives us some hierarchy, this doesn't seem to encode relative
distances at all. That seems like an oversight.

Additionally, I'm somewhat unclear on what you'd be expected to
provide for this property in cases like ring or mesh interconnects,
where there isn't a strict hierarchy (see systems with ARM's own CCN, or
Tilera's TILE-Mx), but there is some measure of closeness.

Must all of these have the same length? If so, why not have a
#(whatever)-cells property in the root to describe the expected length?
If not, how are they to be interpreted relative to each other?
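
To make the concern concrete: about the only thing a consumer can derive
from two such lists is how many leading cells they share. A minimal sketch
of that comparison follows. The function name is hypothetical, and
truncating to the shorter list is an assumption, since the binding text
doesn't define how lists of different lengths relate:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the leading cells two associativity lists
 * have in common. Lists of different lengths are compared over the
 * shorter one only (an assumption, not something the binding specifies).
 */
static size_t assoc_common_depth(const uint32_t *a, size_t a_len,
				 const uint32_t *b, size_t b_len)
{
	size_t n = a_len < b_len ? a_len : b_len;
	size_t i;

	assert(a != NULL && b != NULL);

	for (i = 0; i < n; i++)
		if (a[i] != b[i])
			break;

	return i;	/* 0 means the lists differ at the broadest level */
}
```

For the memory node <0 0 0xffff> and the CPU <0 0 0> from the example this
returns 2, and for the memory on the other board, <1 0 0xffff>, it returns
0. Neither result says how far apart the two are, and a ring or mesh has no
obvious encoding at all.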

> +
> +ex:

s/ex/Example:/, please. There's no need to contract that.

> +       /* board 0, socket 0, cluster 0, core 0  thread 0 */
> +       arm,associativity = <0 0 0 0 0>;
> +
> +==============================================================================
> +3 - arm,associativity-reference-points
> +==============================================================================
> +This property is a set of 32-bit integers, each representing an index into

Likewise, s/set/list/

> +the arm,associativity nodes. The first integer is the most significant
> +NUMA boundary and the following are progressively less significant boundaries.
> +There can be more than one level of NUMA.

I'm not clear on why this is necessary; the arm,associativity property
is already ordered from most significant to least significant per its
description.

What does this property achieve?

The description also doesn't say where this property is expected to
live. The example isn't sufficient to disambiguate that, especially as
it seems like a trivial case.

Is this only expected at the root of the tree? Can it be re-defined in
sub-nodes?
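
For reference, one way a consumer could use the reference points is
sketched below. The doubling rule, the baseline of 10 and the function name
are all assumptions made for illustration; the binding text defines none of
them:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed distance reported when no selected cell differs. */
#define BASE_DISTANCE 10u

/*
 * Hypothetical helper: derive a distance from two associativity lists of
 * equal length, doubling the distance once for every reference point
 * whose selected cell differs between the two lists.
 */
static unsigned int assoc_distance(const uint32_t *a, const uint32_t *b,
				   size_t assoc_len,
				   const uint32_t *ref, size_t ref_len)
{
	unsigned int distance = BASE_DISTANCE;
	size_t i;

	for (i = 0; i < ref_len; i++) {
		/* Each reference point must index a cell that exists. */
		assert(ref[i] < assoc_len);
		if (a[ref[i]] != b[ref[i]])
			distance *= 2;
	}

	return distance;
}
```

With the two memory nodes from the example, <0 0 0xffff> and
<1 0 0xffff>, reference points <0> give 20 under this rule, while <1>
gives 10 because the socket id is 0 on both boards.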

> +
> +Ex:

s/Ex/Example:/, please

> +       arm,associativity-reference-points = <0 1>;
> +       The board Id(index 0) used first to calculate the associativity (node
> +       distance), then follows the  socket id(index 1).
> +
> +       arm,associativity-reference-points = <1 0>;
> +       The socket Id(index 1) used first to calculate the associativity,
> +       then follows the board id(index 0).
> +
> +       arm,associativity-reference-points = <0>;
> +       Only the board Id(index 0) used to calculate the associativity.
> +
> +       arm,associativity-reference-points = <1>;
> +       Only socket Id(index 1) used to calculate the associativity.
> +
> +==============================================================================
> +4 - Example dts
> +==============================================================================
> +
> +Example: 2 Node system consists of 2 boards and each board having one socket
> +and 8 core in each socket.
> +
> +       arm,associativity-reference-points = <0>;
> +
> +       memory@00c00000 {
> +               device_type = "memory";
> +               reg = <0x0 0x00c00000 0x0 0x80000000>;
> +               /* board 0, socket 0, no specific core */
> +               arm,associativity = <0 0 0xffff>;
> +       };
> +
> +       memory@10000000000 {
> +               device_type = "memory";
> +               reg = <0x100 0x00000000 0x0 0x80000000>;
> +               /* board 1, socket 0, no specific core */
> +               arm,associativity = <1 0 0xffff>;
> +       };
> +
> +       cpus {
> +               #address-cells = <2>;
> +               #size-cells = <0>;
> +
> +               cpu@000 {
> +                       device_type = "cpu";
> +                       compatible =  "arm,armv8";
> +                       reg = <0x0 0x000>;
> +                       enable-method = "psci";
> +                       /* board 0, socket 0, core 0*/
> +                       arm,associativity = <0 0 0>;

We should specify w.r.t. memory and CPUs how the property is expected to
be used (e.g. in the CPU nodes rather than the cpu-map, with separate
memory nodes, etc). The generic description of arm,associativity isn't
sufficient to limit confusion there.

Thanks,
Mark.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-28 12:32       ` Mark Rutland
@ 2015-08-28 14:02         ` Rob Herring
  -1 siblings, 0 replies; 94+ messages in thread
From: Rob Herring @ 2015-08-28 14:02 UTC (permalink / raw)
  To: Mark Rutland, Benjamin Herrenschmidt
  Cc: Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4, Pawel Moll,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
	galak-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org

+benh

On Fri, Aug 28, 2015 at 7:32 AM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
> Hi,
>
> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
>> DT bindings for numa map for memory, cores and IOs using
>> arm,associativity device node property.
>
> Given this is just a copy of ibm,associativity, I'm not sure I see much
> point in renaming the properties.

So just keep the ibm prefix? I'm okay with that; it would help with moving
to common code. Alternatively, we could drop the vendor prefix and have
the common code check for both.

>
> However, (somewhat counter to that) I'm also concerned that this isn't
> sufficient for systems we're beginning to see today (more on that
> below), so I don't think a simple copy of ibm,associativity is good
> enough.
>
>>
>> Signed-off-by: Ganapatrao Kulkarni <gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
>> ---
>>  Documentation/devicetree/bindings/arm/numa.txt | 212 +++++++++++++++++++++++++
>>  1 file changed, 212 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>>
>> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
>> new file mode 100644
>> index 0000000..dc3ef86
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/arm/numa.txt
>> @@ -0,0 +1,212 @@
>> +==============================================================================
>> +NUMA binding description.
>> +==============================================================================
>> +
>> +==============================================================================
>> +1 - Introduction
>> +==============================================================================
>> +
>> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
>> +collections of hardware resources including processors, memory, and I/O buses,
>> +that comprise what is commonly known as a NUMA node.
>> +Processor accesses to memory within the local NUMA node is generally faster
>> +than processor accesses to memory outside of the local NUMA node.
>> +DT defines interfaces that allow the platform to convey NUMA node
>> +topology information to OS.
>> +
>> +==============================================================================
>> +2 - arm,associativity
>> +==============================================================================
>> +The mapping is done using arm,associativity device property.
>> +this property needs to be present in every device node which needs to to be
>> +mapped to numa nodes.
>
> Can't there be some inheritance? e.g. all devices on a bus with an
> arm,associativity property being assumed to share that value?

There already is, actually, based on the kernel code. So the documentation
just needs to be explicit about it.
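
Purely as an illustration of the inheritance being discussed (this is not
the existing kernel code), a lookup that falls back to the nearest ancestor
carrying the property might look like this. The struct and function names
are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified node; not the kernel's struct device_node. */
struct node {
	const struct node *parent;	/* NULL for the root */
	const uint32_t *assoc;		/* NULL when the property is absent */
	size_t assoc_len;		/* number of cells in assoc */
};

/*
 * Return the node itself if it carries the associativity property,
 * otherwise the nearest ancestor that does, or NULL if none does.
 */
static const struct node *find_assoc_node(const struct node *np)
{
	while (np && !np->assoc)
		np = np->parent;

	/* A property that is present is expected to be non-empty. */
	assert(!np || np->assoc_len > 0);

	return np;
}
```

A device below a bus that has the property would then share the bus's
value without repeating it in its own node.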

>
>> +
>> +arm,associativity property is set of 32-bit integers which defines level of
>
> s/set/list/ -- the order is important.
>
>> +topology and boundary in the system at which a significant difference in
>> +performance can be measured between cross-device accesses within
>> +a single location and those spanning multiple locations.
>> +The first cell always contains the broadest subdivision within the system,
>> +while the last cell enumerates the individual devices, such as an SMT thread
>> +of a CPU, or a bus bridge within an SoC".
>
> While this gives us some hierarchy, this doesn't seem to encode relative
> distances at all. That seems like an oversight.
>
> Additionally, I'm somewhat unclear on what you'd be expected to
> provide for this property in cases like ring or mesh interconnects,
> where there isn't a strict hierarchy (see systems with ARM's own CCN, or
> Tilera's TILE-Mx), but there is some measure of closeness.
>
> Must all of these have the same length? If so, why not have a
> #(whatever)-cells property in the root to describe the expected length?
> If not, how are they to be interpreted relative to each other?

These are all points that could equally be asked of the IBM binding.
Perhaps Arnd or Ben can provide some insight, or know who can?

Rob

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 0/8] arm64, numa: Add numa support for arm64 platforms.
  2015-08-14 16:39 ` Ganapatrao Kulkarni
@ 2015-08-28 14:31     ` Matthias Brugger
  -1 siblings, 0 replies; 94+ messages in thread
From: Matthias Brugger @ 2015-08-28 14:31 UTC (permalink / raw)
  To: Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will.Deacon-5wv7dgnIgG8,
	catalin.marinas-5wv7dgnIgG8, grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4,
	pawel.moll-5wv7dgnIgG8, mark.rutland-5wv7dgnIgG8,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
	galak-sgV2jX0FEOL9JmXXK+q4OQ
  Cc: gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w



On 14/08/15 18:39, Ganapatrao Kulkarni wrote:
> v5:
> 	- created base verion of numa.c which creates dummy numa without using dt
> 	  on single socket platforms. Then added patches for dt support.
> 	- Incorporated review comments from Hanjun Guo.
>
> v4:
> done changes as per Arnd review comments.
>
> v3:
> Added changes to support numa on arm64 based platforms.
> Tested these patches on cavium's multinode(2 node topology) platform.
> In this patchset, defined and implemented dt bindings for numa mapping
> for core and memory using device node property arm,associativity.
>
> v2:
> Defined and implemented numa map for memory, cores to node and
> proximity distance matrix of nodes.
>
> v1:
> Initial patchset to support numa on arm64 platforms.
>
> Note:
> 	1. This patchset is tested for numa with dt on
> 	   thunderx single socket and dual socket boards.
> 	2. Numa DT booting needs the dt memory nodes, which are deleted in current efi-stub,
> 	hence to try numa with dt, you need to rebase with ard's patchset.
> 	http://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-uefi-early-fdt-handling
>
>
> Ganapatrao Kulkarni (4):
>    arm64, numa: adding numa support for arm64 platforms.
>    Documentation: arm64/arm: dt bindings for numa.
>    arm64, numa: adding numa support for arm64 platforms.
>    arm64, dt, thunderx: Add initial dts for Cavium Thunder SoC in 2 Node
>      topology.
>
>   Documentation/devicetree/bindings/arm/numa.txt  | 212 +++++++
>   arch/arm64/Kconfig                              |  32 +
>   arch/arm64/boot/dts/cavium/Makefile             |   2 +-
>   arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts  |  78 +++
>   arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi | 790 ++++++++++++++++++++++++
>   arch/arm64/include/asm/mmzone.h                 |  32 +
>   arch/arm64/include/asm/numa.h                   |  49 ++
>   arch/arm64/kernel/Makefile                      |   1 +
>   arch/arm64/kernel/dt_numa.c                     | 316 ++++++++++
>   arch/arm64/kernel/setup.c                       |   9 +
>   arch/arm64/kernel/smp.c                         |   3 +
>   arch/arm64/mm/Makefile                          |   1 +
>   arch/arm64/mm/init.c                            |  34 +-
>   arch/arm64/mm/numa.c                            | 563 +++++++++++++++++
>   14 files changed, 2115 insertions(+), 7 deletions(-)
>   create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>   create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
>   create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
>   create mode 100644 arch/arm64/include/asm/mmzone.h
>   create mode 100644 arch/arm64/include/asm/numa.h
>   create mode 100644 arch/arm64/kernel/dt_numa.c
>   create mode 100644 arch/arm64/mm/numa.c
>

On which version is this patch-set based?

Regards,
Matthias

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 0/8] arm64, numa: Add numa support for arm64 platforms.
  2015-08-28 14:31     ` Matthias Brugger
@ 2015-08-28 14:59         ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-08-28 14:59 UTC (permalink / raw)
  To: Matthias Brugger
  Cc: Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	Grant Likely, Leif Lindholm, rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	Ard Biesheuvel, msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring,
	Steve Capper, Hanjun Guo, Al Stone, Arnd Bergmann, Pawel Moll,
	Mark Rutland, Ian Campbell, Kumar Gala

On Fri, Aug 28, 2015 at 8:01 PM, Matthias Brugger
<matthias.bgg-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>
> On 14/08/15 18:39, Ganapatrao Kulkarni wrote:
>>
>> v5:
>>         - created base verion of numa.c which creates dummy numa without
>> using dt
>>           on single socket platforms. Then added patches for dt support.
>>         - Incorporated review comments from Hanjun Guo.
>>
>> v4:
>> done changes as per Arnd review comments.
>>
>> v3:
>> Added changes to support numa on arm64 based platforms.
>> Tested these patches on cavium's multinode(2 node topology) platform.
>> In this patchset, defined and implemented dt bindings for numa mapping
>> for core and memory using device node property arm,associativity.
>>
>> v2:
>> Defined and implemented numa map for memory, cores to node and
>> proximity distance matrix of nodes.
>>
>> v1:
>> Initial patchset to support numa on arm64 platforms.
>>
>> Note:
>>         1. This patchset is tested for numa with dt on
>>            thunderx single socket and dual socket boards.
>>         2. Numa DT booting needs the dt memory nodes, which are deleted in
>> current efi-stub,
>>         hence to try numa with dt, you need to rebase with ard's patchset.
>>
>> http://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-uefi-early-fdt-handling
>>
>>
>> Ganapatrao Kulkarni (4):
>>    arm64, numa: adding numa support for arm64 platforms.
>>    Documentation: arm64/arm: dt bindings for numa.
>>    arm64, numa: adding numa support for arm64 platforms.
>>    arm64, dt, thunderx: Add initial dts for Cavium Thunder SoC in 2 Node
>>      topology.
>>
>>   Documentation/devicetree/bindings/arm/numa.txt  | 212 +++++++
>>   arch/arm64/Kconfig                              |  32 +
>>   arch/arm64/boot/dts/cavium/Makefile             |   2 +-
>>   arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts  |  78 +++
>>   arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi | 790
>> ++++++++++++++++++++++++
>>   arch/arm64/include/asm/mmzone.h                 |  32 +
>>   arch/arm64/include/asm/numa.h                   |  49 ++
>>   arch/arm64/kernel/Makefile                      |   1 +
>>   arch/arm64/kernel/dt_numa.c                     | 316 ++++++++++
>>   arch/arm64/kernel/setup.c                       |   9 +
>>   arch/arm64/kernel/smp.c                         |   3 +
>>   arch/arm64/mm/Makefile                          |   1 +
>>   arch/arm64/mm/init.c                            |  34 +-
>>   arch/arm64/mm/numa.c                            | 563 +++++++++++++++++
>>   14 files changed, 2115 insertions(+), 7 deletions(-)
>>   create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>>   create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
>>   create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
>>   create mode 100644 arch/arm64/include/asm/mmzone.h
>>   create mode 100644 arch/arm64/include/asm/numa.h
>>   create mode 100644 arch/arm64/kernel/dt_numa.c
>>   create mode 100644 arch/arm64/mm/numa.c
>>
>
> On which version is this patch-set based?
v4.2-rc5
>
> Regards,
> Matthias

thanks
Ganapat
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 0/8] arm64, numa: Add numa support for arm64 platforms.
@ 2015-08-28 14:59         ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-08-28 14:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 28, 2015 at 8:01 PM, Matthias Brugger
<matthias.bgg@gmail.com> wrote:
>
>
> On 14/08/15 18:39, Ganapatrao Kulkarni wrote:
>>
>> v5:
>>         - created base verion of numa.c which creates dummy numa without
>> using dt
>>           on single socket platforms. Then added patches for dt support.
>>         - Incorporated review comments from Hanjun Guo.
>>
>> v4:
>> done changes as per Arnd review comments.
>>
>> v3:
>> Added changes to support numa on arm64 based platforms.
>> Tested these patches on cavium's multinode(2 node topology) platform.
>> In this patchset, defined and implemented dt bindings for numa mapping
>> for core and memory using device node property arm,associativity.
>>
>> v2:
>> Defined and implemented numa map for memory, cores to node and
>> proximity distance matrix of nodes.
>>
>> v1:
>> Initial patchset to support numa on arm64 platforms.
>>
>> Note:
>>         1. This patchset is tested for numa with dt on
>>            thunderx single socket and dual socket boards.
>>         2. Numa DT booting needs the dt memory nodes, which are deleted in
>> current efi-stub,
>>         hence to try numa with dt, you need to rebase with ard's patchset.
>>
>> http://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-uefi-early-fdt-handling
>>
>>
>> Ganapatrao Kulkarni (4):
>>    arm64, numa: adding numa support for arm64 platforms.
>>    Documentation: arm64/arm: dt bindings for numa.
>>    arm64, numa: adding numa support for arm64 platforms.
>>    arm64, dt, thunderx: Add initial dts for Cavium Thunder SoC in 2 Node
>>      topology.
>>
>>   Documentation/devicetree/bindings/arm/numa.txt  | 212 +++++++
>>   arch/arm64/Kconfig                              |  32 +
>>   arch/arm64/boot/dts/cavium/Makefile             |   2 +-
>>   arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts  |  78 +++
>>   arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi | 790
>> ++++++++++++++++++++++++
>>   arch/arm64/include/asm/mmzone.h                 |  32 +
>>   arch/arm64/include/asm/numa.h                   |  49 ++
>>   arch/arm64/kernel/Makefile                      |   1 +
>>   arch/arm64/kernel/dt_numa.c                     | 316 ++++++++++
>>   arch/arm64/kernel/setup.c                       |   9 +
>>   arch/arm64/kernel/smp.c                         |   3 +
>>   arch/arm64/mm/Makefile                          |   1 +
>>   arch/arm64/mm/init.c                            |  34 +-
>>   arch/arm64/mm/numa.c                            | 563 +++++++++++++++++
>>   14 files changed, 2115 insertions(+), 7 deletions(-)
>>   create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>>   create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
>>   create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
>>   create mode 100644 arch/arm64/include/asm/mmzone.h
>>   create mode 100644 arch/arm64/include/asm/numa.h
>>   create mode 100644 arch/arm64/kernel/dt_numa.c
>>   create mode 100644 arch/arm64/mm/numa.c
>>
>
> On which version is this patch-set based?
v4.2-rc5
>
> Regards,
> Matthias

thanks
Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 0/8] arm64, numa: Add numa support for arm64 platforms.
  2015-08-28 14:59         ` Ganapatrao Kulkarni
@ 2015-08-28 15:36             ` Matthias Brugger
  -1 siblings, 0 replies; 94+ messages in thread
From: Matthias Brugger @ 2015-08-28 15:36 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	Grant Likely, Leif Lindholm, rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	Ard Biesheuvel, msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring,
	Steve Capper, Hanjun Guo, Al Stone, Arnd Bergmann, Pawel Moll,
	Mark Rutland, Ian Campbell, Kumar Gala



On 28/08/15 16:59, Ganapatrao Kulkarni wrote:
> On Fri, Aug 28, 2015 at 8:01 PM, Matthias Brugger
> <matthias.bgg-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>
>>
>> On 14/08/15 18:39, Ganapatrao Kulkarni wrote:
>>>
>>> v5:
>>>          - created base verion of numa.c which creates dummy numa without
>>> using dt
>>>            on single socket platforms. Then added patches for dt support.
>>>          - Incorporated review comments from Hanjun Guo.
>>>
>>> v4:
>>> done changes as per Arnd review comments.
>>>
>>> v3:
>>> Added changes to support numa on arm64 based platforms.
>>> Tested these patches on cavium's multinode(2 node topology) platform.
>>> In this patchset, defined and implemented dt bindings for numa mapping
>>> for core and memory using device node property arm,associativity.
>>>
>>> v2:
>>> Defined and implemented numa map for memory, cores to node and
>>> proximity distance matrix of nodes.
>>>
>>> v1:
>>> Initial patchset to support numa on arm64 platforms.
>>>
>>> Note:
>>>          1. This patchset is tested for numa with dt on
>>>             thunderx single socket and dual socket boards.
>>>          2. Numa DT booting needs the dt memory nodes, which are deleted in
>>> current efi-stub,
>>>          hence to try numa with dt, you need to rebase with ard's patchset.
>>>
>>> http://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-uefi-early-fdt-handling
>>>
>>>
>>> Ganapatrao Kulkarni (4):
>>>     arm64, numa: adding numa support for arm64 platforms.
>>>     Documentation: arm64/arm: dt bindings for numa.
>>>     arm64, numa: adding numa support for arm64 platforms.
>>>     arm64, dt, thunderx: Add initial dts for Cavium Thunder SoC in 2 Node
>>>       topology.
>>>
>>>    Documentation/devicetree/bindings/arm/numa.txt  | 212 +++++++
>>>    arch/arm64/Kconfig                              |  32 +
>>>    arch/arm64/boot/dts/cavium/Makefile             |   2 +-
>>>    arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts  |  78 +++
>>>    arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi | 790
>>> ++++++++++++++++++++++++
>>>    arch/arm64/include/asm/mmzone.h                 |  32 +
>>>    arch/arm64/include/asm/numa.h                   |  49 ++
>>>    arch/arm64/kernel/Makefile                      |   1 +
>>>    arch/arm64/kernel/dt_numa.c                     | 316 ++++++++++
>>>    arch/arm64/kernel/setup.c                       |   9 +
>>>    arch/arm64/kernel/smp.c                         |   3 +
>>>    arch/arm64/mm/Makefile                          |   1 +
>>>    arch/arm64/mm/init.c                            |  34 +-
>>>    arch/arm64/mm/numa.c                            | 563 +++++++++++++++++
>>>    14 files changed, 2115 insertions(+), 7 deletions(-)
>>>    create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>>>    create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
>>>    create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
>>>    create mode 100644 arch/arm64/include/asm/mmzone.h
>>>    create mode 100644 arch/arm64/include/asm/numa.h
>>>    create mode 100644 arch/arm64/kernel/dt_numa.c
>>>    create mode 100644 arch/arm64/mm/numa.c
>>>
>>
>> On which version is this patch-set based?
> 4.2 -rc5

It seems that
"arm64: add support for memtest" (36dd9086cb)
is not taken into account. I wasn't able to apply the set without
conflicts on v4.2-rc5, on v4.2-rc8, or on
arm64-uefi-early-fdt-handling.

Regards,
Matthias

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 0/8] arm64, numa: Add numa support for arm64 platforms.
@ 2015-08-28 15:36             ` Matthias Brugger
  0 siblings, 0 replies; 94+ messages in thread
From: Matthias Brugger @ 2015-08-28 15:36 UTC (permalink / raw)
  To: linux-arm-kernel



On 28/08/15 16:59, Ganapatrao Kulkarni wrote:
> On Fri, Aug 28, 2015 at 8:01 PM, Matthias Brugger
> <matthias.bgg@gmail.com> wrote:
>>
>>
>> On 14/08/15 18:39, Ganapatrao Kulkarni wrote:
>>>
>>> v5:
>>>          - created base verion of numa.c which creates dummy numa without
>>> using dt
>>>            on single socket platforms. Then added patches for dt support.
>>>          - Incorporated review comments from Hanjun Guo.
>>>
>>> v4:
>>> done changes as per Arnd review comments.
>>>
>>> v3:
>>> Added changes to support numa on arm64 based platforms.
>>> Tested these patches on cavium's multinode(2 node topology) platform.
>>> In this patchset, defined and implemented dt bindings for numa mapping
>>> for core and memory using device node property arm,associativity.
>>>
>>> v2:
>>> Defined and implemented numa map for memory, cores to node and
>>> proximity distance matrix of nodes.
>>>
>>> v1:
>>> Initial patchset to support numa on arm64 platforms.
>>>
>>> Note:
>>>          1. This patchset is tested for numa with dt on
>>>             thunderx single socket and dual socket boards.
>>>          2. Numa DT booting needs the dt memory nodes, which are deleted in
>>> current efi-stub,
>>>          hence to try numa with dt, you need to rebase with ard's patchset.
>>>
>>> http://git.linaro.org/people/ard.biesheuvel/linux-arm.git/shortlog/refs/heads/arm64-uefi-early-fdt-handling
>>>
>>>
>>> Ganapatrao Kulkarni (4):
>>>     arm64, numa: adding numa support for arm64 platforms.
>>>     Documentation: arm64/arm: dt bindings for numa.
>>>     arm64, numa: adding numa support for arm64 platforms.
>>>     arm64, dt, thunderx: Add initial dts for Cavium Thunder SoC in 2 Node
>>>       topology.
>>>
>>>    Documentation/devicetree/bindings/arm/numa.txt  | 212 +++++++
>>>    arch/arm64/Kconfig                              |  32 +
>>>    arch/arm64/boot/dts/cavium/Makefile             |   2 +-
>>>    arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts  |  78 +++
>>>    arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi | 790
>>> ++++++++++++++++++++++++
>>>    arch/arm64/include/asm/mmzone.h                 |  32 +
>>>    arch/arm64/include/asm/numa.h                   |  49 ++
>>>    arch/arm64/kernel/Makefile                      |   1 +
>>>    arch/arm64/kernel/dt_numa.c                     | 316 ++++++++++
>>>    arch/arm64/kernel/setup.c                       |   9 +
>>>    arch/arm64/kernel/smp.c                         |   3 +
>>>    arch/arm64/mm/Makefile                          |   1 +
>>>    arch/arm64/mm/init.c                            |  34 +-
>>>    arch/arm64/mm/numa.c                            | 563 +++++++++++++++++
>>>    14 files changed, 2115 insertions(+), 7 deletions(-)
>>>    create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>>>    create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dts
>>>    create mode 100644 arch/arm64/boot/dts/cavium/thunder-88xx-2n.dtsi
>>>    create mode 100644 arch/arm64/include/asm/mmzone.h
>>>    create mode 100644 arch/arm64/include/asm/numa.h
>>>    create mode 100644 arch/arm64/kernel/dt_numa.c
>>>    create mode 100644 arch/arm64/mm/numa.c
>>>
>>
>> On which version is this patch-set based?
> 4.2 -rc5

It seems that
"arm64: add support for memtest" (36dd9086cb)
is not taken into account. I wasn't able to apply the set without
conflicts on v4.2-rc5, on v4.2-rc8, or on
arm64-uefi-early-fdt-handling.

Regards,
Matthias

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-28 14:02         ` Rob Herring
@ 2015-08-28 21:37             ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 94+ messages in thread
From: Benjamin Herrenschmidt @ 2015-08-28 21:37 UTC (permalink / raw)
  To: Rob Herring, Mark Rutland
  Cc: Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4, Pawel Moll,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
	galak-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org

On Fri, 2015-08-28 at 09:02 -0500, Rob Herring wrote:

> So just keep the ibm? I'm okay with that. That would help move to
> common code. Alternatively, we could drop the vendor prefix and have
> common code just check for both.

That wouldn't be the first time we go down that path and it makes sense
imho.

> All points that could be asked of the IBM binding. Perhaps Arnd or 
> Ben can provide some insight or know who can?

They are part of the PAPR specification, which we've been trying to get
published for a while now, but that hasn't happened yet. Beware that
there are variants of the format based on some other property. There's
also "ibm,associativity-reference-points", which is used to calculate
distances. I'll see if I can get you an excerpt of the PAPR chapter, or
reword it, in the next few days (please poke me if I drop the ball next
week).

Cheers,
Ben.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-08-28 21:37             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 94+ messages in thread
From: Benjamin Herrenschmidt @ 2015-08-28 21:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2015-08-28 at 09:02 -0500, Rob Herring wrote:

> So just keep the ibm? I'm okay with that. That would help move to
> common code. Alternatively, we could drop the vendor prefix and have
> common code just check for both.

That wouldn't be the first time we go down that path and it makes sense
imho.

> All points that could be asked of the IBM binding. Perhaps Arnd or 
> Ben can provide some insight or know who can?

They are part of the PAPR specification, which we've been trying to get
published for a while now, but that hasn't happened yet. Beware that
there are variants of the format based on some other property. There's
also "ibm,associativity-reference-points", which is used to calculate
distances. I'll see if I can get you an excerpt of the PAPR chapter, or
reword it, in the next few days (please poke me if I drop the ball next
week).

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-28 14:02         ` Rob Herring
@ 2015-08-29  9:46             ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 94+ messages in thread
From: Leizhen (ThunderTown) @ 2015-08-29  9:46 UTC (permalink / raw)
  To: Rob Herring, Mark Rutland, Benjamin Herrenschmidt
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4,
	al.stone-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A, Catalin Marinas,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg, Will Deacon,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	Pawel Moll, hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org



On 2015/8/28 22:02, Rob Herring wrote:
> +benh
> 
> On Fri, Aug 28, 2015 at 7:32 AM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
>> Hi,
>>
>> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
>>> DT bindings for numa map for memory, cores and IOs using
>>> arm,associativity device node property.
>>
>> Given this is just a copy of ibm,associativity, I'm not sure I see much
>> point in renaming the properties.
> 
> So just keep the ibm? I'm okay with that. That would help move to
> common code. Alternatively, we could drop the vendor prefix and have
> common code just check for both.
> 

Hi all,

Why not copy the method of ACPI NUMA? Only three elements need to be configured:
1) which node a CPU belongs to
2) which node a memory block belongs to
3) the distance between each pair of nodes

The devicetree nodes for NUMA could look like this:
/ {
	...

	numa-nodes-info {
		node-name: node-description {
			mem-ranges = <...>;
			cpus-list = <...>;
		};

		nodes-distance {
			distance-list = <...>;
		};
	};

	...
};
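
To make the three elements above concrete, here is a minimal sketch of the
data they describe, written in Python purely for illustration. It is not
taken from any patchset in this thread: the table names, the two-node
layout, the address ranges and the distance values are all assumptions.
The only convention borrowed from ACPI is that a SLIT-style matrix uses 10
for the local distance.

```python
# Hypothetical 2-node layout; every name and value here is illustrative only.
from typing import Dict, List, Optional, Tuple

# 1) which node each CPU belongs to
CPU_TO_NODE: Dict[int, int] = {0: 0, 1: 0, 2: 1, 3: 1}

# 2) which node each memory block belongs to, as (base, size, node)
MEM_RANGES: List[Tuple[int, int, int]] = [
    (0x0000_0000, 0x8000_0000, 0),
    (0x8000_0000, 0x8000_0000, 1),
]

# 3) distance between each pair of nodes (10 = local, as in an ACPI SLIT)
DISTANCE: List[List[int]] = [
    [10, 20],
    [20, 10],
]


def node_of_address(addr: int) -> Optional[int]:
    """Return the node owning addr, or None if no range covers it."""
    for base, size, node in MEM_RANGES:
        if base <= addr < base + size:
            return node
    return None


def cpu_to_memory_distance(cpu: int, addr: int) -> Optional[int]:
    """Look up the distance from a CPU's node to the node owning addr."""
    node = node_of_address(addr)
    if node is None:
        return None
    return DISTANCE[CPU_TO_NODE[cpu]][node]
```

With these tables, a lookup for CPU 0 touching memory in the second range
goes through node 0 and node 1 and reads the off-diagonal entry of the
matrix. Note that the model is flat: there is exactly one level called
"node", with no hierarchy underneath it.
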

Sorry, I don't think xxx,associativity is a good method: it's hard to configure, and it
seems hardware-dependent. It is especially hard when we want to support memory hot-add,
because xxx,associativity carries no obvious information about it. powerpc, for example,
uses another property for that: "/ibm,dynamic-reconfiguration-memory".

I spent almost a whole month implementing of_numa (configured by DT nodes), based on the
approach described above. If anybody is interested, I can send my patchset to show it.

Regards,
Thunder.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-08-29  9:46             ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 94+ messages in thread
From: Leizhen (ThunderTown) @ 2015-08-29  9:46 UTC (permalink / raw)
  To: linux-arm-kernel



On 2015/8/28 22:02, Rob Herring wrote:
> +benh
> 
> On Fri, Aug 28, 2015 at 7:32 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>> Hi,
>>
>> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
>>> DT bindings for numa map for memory, cores and IOs using
>>> arm,associativity device node property.
>>
>> Given this is just a copy of ibm,associativity, I'm not sure I see much
>> point in renaming the properties.
> 
> So just keep the ibm? I'm okay with that. That would help move to
> common code. Alternatively, we could drop the vendor prefix and have
> common code just check for both.
> 

Hi all,

Why not copy the method of ACPI NUMA? Only three elements need to be configured:
1) which node a CPU belongs to
2) which node a memory block belongs to
3) the distance between each pair of nodes

The devicetree nodes for NUMA could look like this:
/ {
	...

	numa-nodes-info {
		node-name: node-description {
			mem-ranges = <...>;
			cpus-list = <...>;
		};

		nodes-distance {
			distance-list = <...>;
		};
	};

	...
};

Sorry, I don't think xxx,associativity is a good method: it's hard to configure, and it
seems hardware-dependent. It is especially hard when we want to support memory hot-add,
because xxx,associativity carries no obvious information about it. powerpc, for example,
uses another property for that: "/ibm,dynamic-reconfiguration-memory".

I spent almost a whole month implementing of_numa (configured by DT nodes), based on the
approach described above. If anybody is interested, I can send my patchset to show it.

Regards,
Thunder.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-29  9:46             ` Leizhen (ThunderTown)
@ 2015-08-29 10:37                 ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 94+ messages in thread
From: Benjamin Herrenschmidt @ 2015-08-29 10:37 UTC (permalink / raw)
  To: Leizhen (ThunderTown), Rob Herring, Mark Rutland
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4,
	al.stone-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A, Catalin Marinas,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg, Will Deacon,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	Pawel Moll, hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org

On Sat, 2015-08-29 at 17:46 +0800, Leizhen (ThunderTown) wrote:
> Why not copy the method of ACPI numa? There only three elements
> should be configured:
> 1) a cpu belong to which node
> 2) a memory block belong to which node
> 3) the distance of each two nodes

This means you are bolting the concept of "node" into the DT
representation, which isn't necessarily very meaningful.

Your system is really a hierarchy of objects. You can have cores on a
chip, already possibly sharing some level of cache or not, you can have
chips on a module, modules linked at various distances, etc...

What is "a node" ?

For example, I have a P8 system with 2 chips per module (fast X-bus) and
2 modules (slightly slower A-bus). All 4 chips have 2 memory
controllers each.

Is a "node" a chip or a module ?

The Linux concept of node is too restrictive. The associativity
properties avoid this by allowing you to define as many "levels" of
associativity as you wish. Also since it's right justified, a given
component doesn't need to have all levels (a MC can stop at chip while
cores can go down one more level for example).

The reference points property gives a hint as to which levels are
"interesting"; these can typically be used as a hint for choosing what
Linux will use as a "node", at least until Linux gets smarter. It can
also be used to calculate distances.
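
As a rough illustration of how levels and reference points could yield a
distance, here is a small Python sketch. It is not the PAPR definition and
not the powerpc kernel code: the list layout (outermost domain first), the
reference-point ordering (innermost interesting level first) and the rule
of doubling the distance at each differing level are assumptions made for
the example, and all names are hypothetical. It uses the 2-module, 4-chip
layout described above.

```python
# Illustrative only: layout, ordering and the doubling rule are assumptions.
from typing import Dict, List, Sequence

LOCAL_DISTANCE = 10  # conventional "same place" distance

# Associativity per chip as [module_id, chip_id], outermost level first.
CHIPS: Dict[str, List[int]] = {
    "chip0": [0, 0],
    "chip1": [0, 1],
    "chip2": [1, 2],
    "chip3": [1, 3],
}

# A component may carry more levels than another, e.g. a core below chip0.
CORE0_ON_CHIP0: List[int] = [0, 0, 5]

# Indices of the "interesting" levels, innermost first: chip, then module.
REF_POINTS: List[int] = [1, 0]


def distance(a: Sequence[int], b: Sequence[int],
             ref_points: Sequence[int] = REF_POINTS) -> int:
    """Double the distance for each reference level at which a and b differ.

    Both lists must contain every index named in ref_points.
    """
    dist = LOCAL_DISTANCE
    for level in ref_points:
        if a[level] == b[level]:
            break  # same domain here, so also the same at all outer levels
        dist *= 2
    return dist
```

Under these assumptions, two chips on the same module come out closer
than two chips on different modules, and a core can be compared with a
chip-level component because only the shared reference levels are
examined.
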

Cheers,
Ben.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-08-29 10:37                 ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 94+ messages in thread
From: Benjamin Herrenschmidt @ 2015-08-29 10:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, 2015-08-29 at 17:46 +0800, Leizhen (ThunderTown) wrote:
> Why not copy the method of ACPI numa? There only three elements
> should be configured:
> 1) a cpu belong to which node
> 2) a memory block belong to which node
> 3) the distance of each two nodes

This means you are bolting the concept of "node" into the DT
representation, which isn't necessarily very meaningful.

Your system is really a hierarchy of objects. You can have cores on a
chip, already possibly sharing some level of cache or not, you can have
chips on a module, modules linked at various distances, etc...

What is "a node" ?

For example, I have a P8 system with 2 chips per module (fast X-bus) and
2 modules (slightly slower A-bus). All 4 chips have 2 memory
controllers each.

Is a "node" a chip or a module ?

The Linux concept of node is too restrictive. The associativity
properties avoid this by allowing you to define as many "levels" of
associativity as you wish. Also since it's right justified, a given
component doesn't need to have all levels (a MC can stop at chip while
cores can go down one more level for example).

The reference points property gives a hint as to which levels are
"interesting"; these can typically be used as a hint for choosing what
Linux will use as a "node", at least until Linux gets smarter. It can
also be used to calculate distances.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-29  9:46             ` Leizhen (ThunderTown)
@ 2015-08-29 14:56                 ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-08-29 14:56 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Rob Herring, Mark Rutland, Benjamin Herrenschmidt,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4,
	al.stone-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A, Catalin Marinas,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg, Will Deacon,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	Pawel Moll, hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	galak-sgV2jX0FEOL9JmXXK+q4OQ

Hi Thunder,

On Sat, Aug 29, 2015 at 3:16 PM, Leizhen (ThunderTown)
<thunder.leizhen-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> wrote:
>
>
> On 2015/8/28 22:02, Rob Herring wrote:
>> +benh
>>
>> On Fri, Aug 28, 2015 at 7:32 AM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
>>> Hi,
>>>
>>> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
>>>> DT bindings for numa map for memory, cores and IOs using
>>>> arm,associativity device node property.
>>>
>>> Given this is just a copy of ibm,associativity, I'm not sure I see much
>>> point in renaming the properties.
>>
>> So just keep the ibm? I'm okay with that. That would help move to
>> common code. Alternatively, we could drop the vendor prefix and have
>> common code just check for both.
>>
>
> Hi all,
>
> Why not copy the method of ACPI numa? There only three elements should be configured:
> 1) a cpu belong to which node
> 2) a memory block belong to which node
> 3) the distance of each two nodes
>
> The devicetree nodes of numa can be like below:
> / {
>         ...
>
>         numa-nodes-info {
>                 node-name: node-description {
>                         mem-ranges = <...>;
>                         cpus-list = <...>;
>                 };
>
>                 nodes-distance {
>                         distance-list = <...>;
>                 };
>         };
>
>         ...
> };
>
Something similar to what you are proposing was already implemented in
my v2 patchset:
https://lwn.net/Articles/623920/
http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/305164.html
We moved to an associativity-property-based implementation to keep it
more generic.
I have both ACPI (using Linaro/Hanjun's patches) and associativity-based
implementations in our internal tree, tested on the ThunderX platform.
I do see an issue in creating NUMA mappings for IOs using ACPI (for
example, I am not able to create a NUMA mapping for the ITS, which is
present on each node, using ACPI tables), since the ACPI spec (SRAT and
SLIT tables) talks only about processors and memory.
Associativity, however, is generic and can be applied to any DT node.
> Sorry, I don't think xxx,associativity is a good method, it's hard to config, and it
> seems hardware-dependent. Especially, when we want to support memory hot-add, it's too hard.
> Because xxx,associativity have no obvious information about it. Like powerpc, it use another
> property: "/ibm,dynamic-reconfiguration-memory".
>
> I spend almost a whole month to implement of_numa(configured by dt-nodes), base upon my opinion
> mentioned above. If somebody are interested in it, I can send my patchset to show it.
>
> Regards,
> Thunder.
>
thanks
ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-29 10:37                 ` Benjamin Herrenschmidt
@ 2015-08-31  1:46                     ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 94+ messages in thread
From: Leizhen (ThunderTown) @ 2015-08-31  1:46 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Rob Herring, Mark Rutland
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4,
	al.stone-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A, Catalin Marinas,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg, Will Deacon,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	Pawel Moll, hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org



On 2015/8/29 18:37, Benjamin Herrenschmidt wrote:
> On Sat, 2015-08-29 at 17:46 +0800, Leizhen (ThunderTown) wrote:
>> Why not copy the method of ACPI numa? There only three elements
>> should be configured:
>> 1) a cpu belong to which node
>> 2) a memory block belong to which node
>> 3) the distance of each two nodes

Sorry, I forgot to mention one more element:
4) which node a device (possibly a bus device) belongs to

For example:
device-name {
        ...
        numa-node = <&node0>;
};

To simplify the discussion, I will not mention devices again. Treat both
CPUs and devices as masters, and memories as slaves.

A bus is not a master, but we allow binding a NUMA node to a bus because we may
want all devices on the bus to inherit its NUMA node id without configuring each one explicitly.
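
The inheritance rule described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `struct dev`, `NO_NODE` and `dev_numa_node()` are hypothetical names, not part of any posted patch, and a real implementation would walk devicetree nodes rather than this simplified structure.

```c
#include <stddef.h>

#define NO_NODE (-1)	/* hypothetical marker: no numa-node configured here */

/* Hypothetical, simplified stand-in for a device or bus in the tree. */
struct dev {
	const struct dev *parent;	/* enclosing bus, or NULL at the root */
	int numa_node;			/* explicitly configured node id, or NO_NODE */
};

/*
 * Return the node id of the nearest ancestor (the device itself included)
 * that carries an explicit binding, or NO_NODE if none of them does.
 */
int dev_numa_node(const struct dev *d)
{
	for (; d != NULL; d = d->parent)
		if (d->numa_node != NO_NODE)
			return d->numa_node;
	return NO_NODE;
}
```

With this rule a device that sets its own node keeps it, while a device without one picks up the node of the closest bus above it.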

> 
> This means your are bolting into the DT representation the concept of
> "Node" which isn't necessarily very meaningful.
> 
> Your system is really a hierarchy of objects. You can have cores on a
> chip, already possibly sharing some level of cache or not, you can have
> chips on a module, modules linked at various distances, etc...
> 
> What is "a node" ?
> 
> For example, I have a P8 chip with 2 chips on a module (fast X-bus) and
> 2 modules (slightly slower A-bus). All 4 chips have 2 memory
> controllers each.
> 
> Is a "node" a chip or a module ?

A NUMA node is an abstract concept; it need not correspond to a real hardware level.
A NUMA node normally contains both CPUs and memory, but it may contain only CPUs or only memory,
or possibly nothing (quite rare). We put CPUs and memory into a node because we want to use
node distance to implement nearest-memory access and nearest-node process scheduling.

In your example:
On the fast X-bus, there is a module containing 2 chips.
On the slightly slower A-bus, there are 2 modules (treat them as 2 chips).
Each chip contains 2 memory controllers.

Suppose each chip accesses memory on its local bus faster than memory on the other bus.

Case 1:
Each chip accesses its 2 local memory controllers faster than the others. Then we can define these NUMA nodes:
node-xbus-0: a chip and its 2 local memory controllers.
node-xbus-1: a chip and its 2 local memory controllers.
node-abus-0: a chip (module) and its 2 local memory controllers.
node-abus-1: a chip (module) and its 2 local memory controllers.

Case 2:
Each chip accesses every memory controller on its local bus at the same speed. Then we can define these NUMA nodes:
node-xbus: 2 chips and their 4 local memory controllers.
node-abus: 2 chips (modules) and their 4 local memory controllers.
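
To make the role of node distance concrete, here is a small sketch of nearest-memory selection for the four nodes of Case 1. The distance values are invented for illustration (10 for local, 15 for the other node on the same bus, 20 across buses), and `case1_distance` and `nearest_node_with_memory()` are hypothetical names. Case 2 would simply use a 2x2 table.

```c
#define NR_NODES 4

/*
 * Node ids for Case 1 (hypothetical numbering):
 * 0 = node-xbus-0, 1 = node-xbus-1, 2 = node-abus-0, 3 = node-abus-1.
 * The distances are made-up values, not measurements.
 */
const int case1_distance[NR_NODES][NR_NODES] = {
	{ 10, 15, 20, 20 },
	{ 15, 10, 20, 20 },
	{ 20, 20, 10, 15 },
	{ 20, 20, 15, 10 },
};

/*
 * Pick the node closest to 'from' that still has free memory.
 * Returns -1 if no node has free memory; ties go to the lower node id.
 */
int nearest_node_with_memory(int from, const int has_free_mem[NR_NODES])
{
	int best = -1;

	for (int n = 0; n < NR_NODES; n++) {
		if (!has_free_mem[n])
			continue;
		if (best < 0 || case1_distance[from][n] < case1_distance[from][best])
			best = n;
	}
	return best;
}
```

An allocation from node 0 stays local while node 0 has memory, falls back to node 1 on the same bus next, and only then crosses to the other bus.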


> 
> The Linux concept of node is too restrictive. The associativity
> properties avoid this by allowing you to define as many "levels" of
> associativity as you wish. Also since it's right justified, a given
> component doesn't need to have all levels (a MC can stop at chip while
> cores can go down one more level for example).
> 
> The reference points property gives a hint as "interesting" levels can
> typically be used as a hint for chosing what Linux will use as a "node"
> at least until Linux gets smarter. It can also be used to calculate
> distances.
> 
> Cheers,
> Ben.
> 
> 
> .
> 
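
For comparison, one possible reading of the reference-points idea quoted above is sketched below, using Ben's example of 2 modules with 2 chips each. Everything here is an assumption for illustration: the array layout (one domain id per level, outermost first), the ordering of the reference points (finest level first), the requirement that ids at a level are unique system-wide, and the rule of starting at 10 and doubling per differing level. None of it is taken from PAPR or from this patchset.

```c
#include <stddef.h>

#define LOCAL_DISTANCE 10

/*
 * assoc_a / assoc_b: one domain id per associativity level, outermost
 * level first (hypothetical layout). Ids at a given level are assumed
 * to be unique across the whole system.
 * ref_points: indices of the "interesting" levels, finest level first.
 *
 * Start from the local distance and double it for every reference level
 * at which the two components sit in different domains; stop at the
 * first level they share.
 */
int assoc_distance(const unsigned int *assoc_a, const unsigned int *assoc_b,
		   const size_t *ref_points, size_t nr_ref_points)
{
	int distance = LOCAL_DISTANCE;

	for (size_t i = 0; i < nr_ref_points; i++) {
		if (assoc_a[ref_points[i]] == assoc_b[ref_points[i]])
			break;
		distance *= 2;
	}
	return distance;
}
```

With levels { module, chip } and reference points { chip, module }, two components on the same chip are at distance 10, chips sharing a module at 20, and chips on different modules at 40.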


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-29 14:56                 ` Ganapatrao Kulkarni
@ 2015-08-31  2:53                   ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 94+ messages in thread
From: Leizhen (ThunderTown) @ 2015-08-31  2:53 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: Mark Rutland, Benjamin Herrenschmidt, Will Deacon, Pawel Moll,
	al.stone, Prasun Kapoor, Catalin Marinas, grant.likely,
	devicetree, steve.capper, arnd, ijc+devicetree, msalter,
	leif.lindholm, rfranz, robh+dt, linux-arm-kernel, ard.biesheuvel,
	han



On 2015/8/29 22:56, Ganapatrao Kulkarni wrote:
> Hi Thunder,
> 
> On Sat, Aug 29, 2015 at 3:16 PM, Leizhen (ThunderTown)
> <thunder.leizhen@huawei.com> wrote:
>>
>>
>> On 2015/8/28 22:02, Rob Herring wrote:
>>> +benh
>>>
>>> On Fri, Aug 28, 2015 at 7:32 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>>>> Hi,
>>>>
>>>> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
>>>>> DT bindings for numa map for memory, cores and IOs using
>>>>> arm,associativity device node property.
>>>>
>>>> Given this is just a copy of ibm,associativity, I'm not sure I see much
>>>> point in renaming the properties.
>>>
>>> So just keep the ibm? I'm okay with that. That would help move to
>>> common code. Alternatively, we could drop the vendor prefix and have
>>> common code just check for both.
>>>
>>
>> Hi all,
>>
>> Why not copy the method of ACPI numa? There only three elements should be configured:
>> 1) a cpu belong to which node
>> 2) a memory block belong to which node
>> 3) the distance of each two nodes
>>
>> The devicetree nodes of numa can be like below:
>> / {
>>         ...
>>
>>         numa-nodes-info {
>>                 node-name: node-description {
>>                         mem-ranges = <...>;
>>                         cpus-list = <...>;
>>                 };
>>
>>                 nodes-distance {
>>                         distance-list = <...>;
>>                 };
>>         };
>>
>>         ...
>> };
>>
> some what similar to what your are proposing is already implemented in
> my v2 patchset.
> https://lwn.net/Articles/623920/
> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/305164.html

Sorry, I had not read the earlier versions of your patchset.

The basic ideas are consistent, but the details differ. I think your v2 patchset may run into some problems:

-------------------------
+- cpu-map:	This property defines the association of range of processors
+		(range of cpu ids) and the proximity domain to which
+		the processor belongs.

+		cpu-map = <0 7 0>,
+			  <8 15 1>;
-------------------------

1. I am not sure whether the cpu ids are the logical CPU ids in Linux or the sequence numbers of the CPU dt-nodes in the dts.
In the former case, logical CPU ids are allocated by Linux, so we cannot ensure that cpu0 is the first CPU dt-node.
In the latter case, the binding depends on Linux parsing the CPU dt-nodes strictly in the order they appear in the dts.

2. You should put most of the code into /drivers/of/, because it can be shared with other architectures that are based on devicetree.
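
To illustrate point 1, here is a rough sketch of how the quoted `cpu-map` triples (first cpu id, last cpu id, node) could be looked up. `struct cpu_range` and `cpu_map_lookup()` are hypothetical names, not code from the v2 patchset. The lookup itself is trivial; its result is only meaningful once it is settled which numbering the `cpu` argument uses, which is exactly the ambiguity raised above.

```c
#include <stddef.h>

/* One <first last node> triple from the quoted cpu-map property. */
struct cpu_range {
	unsigned int first;	/* first cpu id in the range, inclusive */
	unsigned int last;	/* last cpu id in the range, inclusive */
	int node;		/* proximity domain for the whole range */
};

/* The two triples from the quoted binding text. */
const struct cpu_range cpu_map[] = {
	{ 0, 7, 0 },
	{ 8, 15, 1 },
};

/* Return the node for 'cpu', or -1 if no range covers it. */
int cpu_map_lookup(const struct cpu_range *map, size_t nr_ranges,
		   unsigned int cpu)
{
	for (size_t i = 0; i < nr_ranges; i++)
		if (cpu >= map[i].first && cpu <= map[i].last)
			return map[i].node;
	return -1;
}
```

Referencing CPUs by phandle, as in `cpus-list = <&CPU0 &CPU1>`, avoids this question because no id numbering is involved.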

Here is my detailed example:
/ {
	#address-cells = <2>;
	#size-cells = <2>;

	memory@0 {
		device_type = "memory";
		reg = <0x0 0x00000000 0x0 0x40000000>,
		      <0x1 0x00000000 0x1 0x00000000>,
		      <0x2 0x00000000 0x0 0x40000000>,
		      <0x2 0x80000000 0x0 0x40000000>;
	};

	CPU0: cpu@10000 {
		device_type = "cpu";
		reg = <0x10000>;
		...
	};

	numa-nodes-info {
		node0: cluster0 {
			mem-ranges = <0x0 0x00000000 0x1 0x00000000>;
			cpus-list = <&CPU0 &CPU1>;
		};

		node1: cluster1 {
			mem-ranges = <0x1 0x00000000 0x1 0x00000000>;
			cpus-list = <&CPU2>;
		};

		node2: cluster2 {
			mem-ranges = <0x2 0x00000000 0x1 0x00000000>;
			cpus-list = <&CPU3 &CPU4 &CPU5>;
		};

		nodes-distance {
			distance-list = <&node0 &node1 15>, <&node1 &node2 18>;
		};
	};
};
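
As a rough sketch of how a sparse `distance-list` like the one above might be expanded into a full table: the names below are hypothetical, node labels are replaced by plain indices (node0 = 0 and so on), and two assumptions are made that the example does not state, namely that distances are symmetric and that the local distance is 10 (the value ACPI SLIT uses for local). Pairs that are not listed, such as node0/node2 here, are left unset rather than guessed.

```c
#include <stddef.h>

#define NR_NODES	3
#define LOCAL_DISTANCE	10	/* assumption: same local value as ACPI SLIT */
#define NO_DISTANCE	0	/* hypothetical marker: pair not described */

/* One <&nodeA &nodeB distance> entry, with phandles replaced by indices. */
struct distance_entry {
	int a;
	int b;
	int distance;
};

/* The two entries from the example: node0-node1 = 15, node1-node2 = 18. */
const struct distance_entry distance_list[] = {
	{ 0, 1, 15 },
	{ 1, 2, 18 },
};

/* Fill a symmetric NR_NODES x NR_NODES table from the sparse list. */
void build_distance_table(const struct distance_entry *list, size_t nr_entries,
			  int table[NR_NODES][NR_NODES])
{
	for (int i = 0; i < NR_NODES; i++)
		for (int j = 0; j < NR_NODES; j++)
			table[i][j] = (i == j) ? LOCAL_DISTANCE : NO_DISTANCE;

	for (size_t k = 0; k < nr_entries; k++) {
		table[list[k].a][list[k].b] = list[k].distance;
		table[list[k].b][list[k].a] = list[k].distance;
	}
}
```

A consumer of the table would still need a policy for unset pairs, for example rejecting the description or falling back to a default remote distance.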

> we have went to associativity property based implementation to keep it
> more generic.
> i do have both acpi(using linaro/hanjun's patches) and associativity
> based implementations on our internal tree
> and tested on thunderx platform.
> i do see issue in creating numa mapping using ACPI for IOs(for
> example, i am not able to create numa mapping for ITS which is on each
> node, using ACPI tables),  since ACPI spec (tables SRAT and SLIT)
> talks only about processor and memory.
> however associativity is generic and you can apply on any dt node.
>> Sorry, I don't think xxx,associativity is a good method, it's hard to config, and it
>> seems hardware-dependent. Especially, when we want to support memory hot-add, it's too hard.
>> Because xxx,associativity have no obvious information about it. Like powerpc, it use another
>> property: "/ibm,dynamic-reconfiguration-memory".
>>
>> I spend almost a whole month to implement of_numa(configured by dt-nodes), base upon my opinion
>> mentioned above. If somebody are interested in it, I can send my patchset to show it.
>>
>> Regards,
>> Thunder.
>>
> thanks
> ganapat
> 
> .
> 

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-28 21:37             ` Benjamin Herrenschmidt
@ 2015-09-02 17:11                 ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-09-02 17:11 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Rob Herring, Mark Rutland, Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4, Pawel Moll,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg

Hi Ben,

On Sat, Aug 29, 2015 at 3:07 AM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Fri, 2015-08-28 at 09:02 -0500, Rob Herring wrote:
>
>> So just keep the ibm? I'm okay with that. That would help move to
>> common code. Alternatively, we could drop the vendor prefix and have
>> common code just check for both.
>
> That wouldn't be the first time we go down that path and it makes sense
> imho.
>
>> All points that could be asked of the IBM binding. Perhaps Arnd or
>> Ben can provide some insight or know who can?
>
> They are part of the PAPR specification which we've been trying to get
> published for a while now but that hasn't happened yet. Beware that
> there are variants of the format based on some other property. There's
> also
> "ibm,associativity-reference-points" which is used to calculate
> distances. I'll see if I can get you an excerpt of the PAPR chapter, or
> reword it, in the next few days (please poke me if I drop the ball next
> week).
Did you get a chance to write up an excerpt of the PAPR chapter?
Please share the details.
>
> Cheers,
> Ben.
>
thanks
Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 1/4] arm64, numa: adding numa support for arm64 platforms.
  2015-08-14 16:39   ` Ganapatrao Kulkarni
@ 2015-09-03  9:52       ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-09-03  9:52 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, Will Deacon, Catalin Marinas
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Grant Likely, Leif Lindholm,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, Ard Biesheuvel,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring, Steve Capper,
	Hanjun Guo, Al Stone, Arnd Bergmann, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Prasun Kapoor, Robert Richter

Hi Catalin/Will,

On Fri, Aug 14, 2015 at 10:09 PM, Ganapatrao Kulkarni
<gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org> wrote:
> Adding numa support for arm64 based platforms.
> This patch adds by default the dummy numa node and
> maps all memory and cpus to node 0.
> using this patch, numa can be simulated on single node arm64 platforms.
>
> Signed-off-by: Ganapatrao Kulkarni <gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
> ---
>  arch/arm64/Kconfig              |  26 ++
>  arch/arm64/include/asm/mmzone.h |  32 +++
>  arch/arm64/include/asm/numa.h   |  42 +++
>  arch/arm64/kernel/setup.c       |   9 +
>  arch/arm64/kernel/smp.c         |   2 +
>  arch/arm64/mm/Makefile          |   1 +
>  arch/arm64/mm/init.c            |  34 ++-
>  arch/arm64/mm/numa.c            | 550 ++++++++++++++++++++++++++++++++++++++++
>  8 files changed, 690 insertions(+), 6 deletions(-)
>  create mode 100644 arch/arm64/include/asm/mmzone.h
>  create mode 100644 arch/arm64/include/asm/numa.h
>  create mode 100644 arch/arm64/mm/numa.c
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 43a0c26..fa37a5d 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -72,6 +72,7 @@ config ARM64
>         select HAVE_PERF_USER_STACK_DUMP
>         select HAVE_RCU_TABLE_FREE
>         select HAVE_SYSCALL_TRACEPOINTS
> +       select HAVE_MEMBLOCK_NODE_MAP if NUMA
>         select IRQ_DOMAIN
>         select IRQ_FORCED_THREADING
>         select MODULES_USE_ELF_RELA
> @@ -559,6 +560,31 @@ config HOTPLUG_CPU
>           Say Y here to experiment with turning CPUs off and on.  CPUs
>           can be controlled through /sys/devices/system/cpu.
>
> +# Common NUMA Features
> +config NUMA
> +       bool "Numa Memory Allocation and Scheduler Support"
> +       depends on SMP
> +       help
> +         Enable NUMA (Non Uniform Memory Access) support.
> +
> +         The kernel will try to allocate memory used by a CPU on the
> +         local memory controller of the CPU and add some more
> +         NUMA awareness to the kernel.
> +
> +config NODES_SHIFT
> +       int "Maximum NUMA Nodes (as a power of 2)"
> +       range 1 10
> +       default "2"
> +       depends on NEED_MULTIPLE_NODES
> +       help
> +         Specify the maximum number of NUMA Nodes available on the target
> +         system.  Increases memory reserved to accommodate various tables.
> +
> +config USE_PERCPU_NUMA_NODE_ID
> +       def_bool y
> +       depends on NUMA
> +
> +
>  source kernel/Kconfig.preempt
>
>  config UP_LATE_INIT
> diff --git a/arch/arm64/include/asm/mmzone.h b/arch/arm64/include/asm/mmzone.h
> new file mode 100644
> index 0000000..d27ee66
> --- /dev/null
> +++ b/arch/arm64/include/asm/mmzone.h
> @@ -0,0 +1,32 @@
> +#ifndef __ASM_ARM64_MMZONE_H_
> +#define __ASM_ARM64_MMZONE_H_
> +
> +#ifdef CONFIG_NUMA
> +
> +#include <linux/mmdebug.h>
> +#include <asm/smp.h>
> +#include <linux/types.h>
> +#include <asm/numa.h>
> +
> +extern struct pglist_data *node_data[];
> +
> +#define NODE_DATA(nid)         (node_data[nid])
> +
> +
> +struct numa_memblk {
> +       u64                     start;
> +       u64                     end;
> +       int                     nid;
> +};
> +
> +struct numa_meminfo {
> +       int                     nr_blks;
> +       struct numa_memblk      blk[NR_NODE_MEMBLKS];
> +};
> +
> +void __init numa_remove_memblk_from(int idx, struct numa_meminfo *mi);
> +int __init numa_cleanup_meminfo(struct numa_meminfo *mi);
> +void __init numa_reset_distance(void);
> +
> +#endif /* CONFIG_NUMA */
> +#endif /* __ASM_ARM64_MMZONE_H_ */
> diff --git a/arch/arm64/include/asm/numa.h b/arch/arm64/include/asm/numa.h
> new file mode 100644
> index 0000000..59b834e
> --- /dev/null
> +++ b/arch/arm64/include/asm/numa.h
> @@ -0,0 +1,42 @@
> +#ifndef _ASM_NUMA_H
> +#define _ASM_NUMA_H
> +
> +#include <linux/nodemask.h>
> +#include <asm/topology.h>
> +
> +#ifdef CONFIG_NUMA
> +
> +#define NR_NODE_MEMBLKS                (MAX_NUMNODES * 2)
> +#define ZONE_ALIGN (1UL << (MAX_ORDER + PAGE_SHIFT))
> +
> +/* currently, arm64 implements flat NUMA topology */
> +#define parent_node(node)      (node)
> +
> +/* dummy definitions for pci functions */
> +#define pcibus_to_node(node)   0
> +#define cpumask_of_pcibus(bus) 0
> +
> +struct __node_cpu_hwid {
> +       u32 node_id;    /* logical node containing this CPU */
> +       u64 cpu_hwid;   /* MPIDR for this CPU */
> +};
> +
> +extern struct __node_cpu_hwid node_cpu_hwid[NR_CPUS];
> +extern nodemask_t numa_nodes_parsed __initdata;
> +
> +const struct cpumask *cpumask_of_node(int node);
> +/* Mappings between node number and cpus on that node. */
> +extern cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
> +
> +void __init arm64_numa_init(void);
> +int __init numa_add_memblk(u32 nodeid, u64 start, u64 end);
> +void numa_store_cpu_info(int cpu);
> +void __init build_cpu_to_node_map(void);
> +void __init numa_set_distance(int from, int to, int distance);
> +#else  /* CONFIG_NUMA */
> +static inline void numa_store_cpu_info(int cpu)                { }
> +static inline void arm64_numa_init(void)               { }
> +static inline void build_cpu_to_node_map(void) { }
> +static inline void numa_set_distance(int from, int to, int distance) { }
> +#endif /* CONFIG_NUMA */
> +#endif /* _ASM_NUMA_H */
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 4d4d7ce..6e101eb 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -65,6 +65,7 @@
>  #include <asm/efi.h>
>  #include <asm/virt.h>
>  #include <asm/xen/hypervisor.h>
> +#include <asm/numa.h>
>
>  unsigned long elf_hwcap __read_mostly;
>  EXPORT_SYMBOL_GPL(elf_hwcap);
> @@ -439,6 +440,9 @@ static int __init topology_init(void)
>  {
>         int i;
>
> +       for_each_online_node(i)
> +               register_one_node(i);
> +
>         for_each_possible_cpu(i) {
>                 struct cpu *cpu = &per_cpu(cpu_data.cpu, i);
>                 cpu->hotpluggable = 1;
> @@ -511,7 +515,12 @@ static int c_show(struct seq_file *m, void *v)
>                  * "processor".  Give glibc what it expects.
>                  */
>  #ifdef CONFIG_SMP
> +       if (IS_ENABLED(CONFIG_NUMA)) {
> +               seq_printf(m, "processor\t: %d", i);
> +               seq_printf(m, " [nid: %d]\n", cpu_to_node(i));
> +       } else {
>                 seq_printf(m, "processor\t: %d\n", i);
> +       }
>  #endif
>
>                 /*
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 50fb469..ae3e02c 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -52,6 +52,7 @@
>  #include <asm/sections.h>
>  #include <asm/tlbflush.h>
>  #include <asm/ptrace.h>
> +#include <asm/numa.h>
>
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/ipi.h>
> @@ -124,6 +125,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>  static void smp_store_cpu_info(unsigned int cpuid)
>  {
>         store_cpu_topology(cpuid);
> +       numa_store_cpu_info(cpuid);
>  }
>
>  /*
> diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
> index 773d37a..bb92d41 100644
> --- a/arch/arm64/mm/Makefile
> +++ b/arch/arm64/mm/Makefile
> @@ -4,3 +4,4 @@ obj-y                           := dma-mapping.o extable.o fault.o init.o \
>                                    context.o proc.o pageattr.o
>  obj-$(CONFIG_HUGETLB_PAGE)     += hugetlbpage.o
>  obj-$(CONFIG_ARM64_PTDUMP)     += dump.o
> +obj-$(CONFIG_NUMA)             += numa.o
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 54da32e..cab384b 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -42,6 +42,7 @@
>  #include <asm/sizes.h>
>  #include <asm/tlb.h>
>  #include <asm/alternative.h>
> +#include <asm/numa.h>
>
>  #include "mm.h"
>
> @@ -77,6 +78,20 @@ static phys_addr_t max_zone_dma_phys(void)
>         return min(offset + (1ULL << 32), memblock_end_of_DRAM());
>  }
>
> +#ifdef CONFIG_NUMA
> +static void __init zone_sizes_init(unsigned long min, unsigned long max)
> +{
> +       unsigned long max_zone_pfns[MAX_NR_ZONES];
> +
> +       memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
> +       if (IS_ENABLED(CONFIG_ZONE_DMA))
> +               max_zone_pfns[ZONE_DMA] = PFN_DOWN(max_zone_dma_phys());
> +       max_zone_pfns[ZONE_NORMAL] = max;
> +
> +       free_area_init_nodes(max_zone_pfns);
> +}
> +
> +#else
>  static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  {
>         struct memblock_region *reg;
> @@ -115,6 +130,7 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>
>         free_area_init_node(0, zone_size, min, zhole_size);
>  }
> +#endif /* CONFIG_NUMA */
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>  int pfn_valid(unsigned long pfn)
> @@ -132,10 +148,15 @@ static void arm64_memory_present(void)
>  static void arm64_memory_present(void)
>  {
>         struct memblock_region *reg;
> +       int nid = 0;
>
> -       for_each_memblock(memory, reg)
> -               memory_present(0, memblock_region_memory_base_pfn(reg),
> -                              memblock_region_memory_end_pfn(reg));
> +       for_each_memblock(memory, reg) {
> +#ifdef CONFIG_NUMA
> +               nid = reg->nid;
> +#endif
> +               memory_present(nid, memblock_region_memory_base_pfn(reg),
> +                               memblock_region_memory_end_pfn(reg));
> +       }
>  }
>  #endif
>
> @@ -200,6 +221,10 @@ void __init bootmem_init(void)
>         min = PFN_UP(memblock_start_of_DRAM());
>         max = PFN_DOWN(memblock_end_of_DRAM());
>
> +       high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
> +       max_pfn = max_low_pfn = max;
> +
> +       arm64_numa_init();
>         /*
>          * Sparsemem tries to allocate bootmem in memory_present(), so must be
>          * done after the fixed reservations.
> @@ -208,9 +233,6 @@ void __init bootmem_init(void)
>
>         sparse_init();
>         zone_sizes_init(min, max);
> -
> -       high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
> -       max_pfn = max_low_pfn = max;
>  }
>
>  #ifndef CONFIG_SPARSEMEM_VMEMMAP
> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
> new file mode 100644
> index 0000000..2be83de
> --- /dev/null
> +++ b/arch/arm64/mm/numa.c
> @@ -0,0 +1,550 @@
> +/*
> + * NUMA support, based on the x86 implementation.
> + *
> + * Copyright (C) 2015 Cavium Inc.
> + * Author: Ganapatrao Kulkarni <gkulkarni@cavium.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/mm.h>
> +#include <linux/string.h>
> +#include <linux/init.h>
> +#include <linux/bootmem.h>
> +#include <linux/memblock.h>
> +#include <linux/ctype.h>
> +#include <linux/module.h>
> +#include <linux/nodemask.h>
> +#include <linux/sched.h>
> +#include <linux/topology.h>
> +#include <linux/mmzone.h>
> +
> +#include <asm/smp_plat.h>
> +
> +int __initdata numa_off;
> +nodemask_t numa_nodes_parsed __initdata;
> +static int numa_distance_cnt;
> +static u8 *numa_distance;
> +static u8 dummy_numa_enabled;
> +
> +struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
> +EXPORT_SYMBOL(node_data);
> +
> +struct __node_cpu_hwid node_cpu_hwid[NR_CPUS];
> +static struct numa_meminfo numa_meminfo;
> +
> +static __init int numa_setup(char *opt)
> +{
> +       if (!opt)
> +               return -EINVAL;
> +       if (!strncmp(opt, "off", 3)) {
> +               pr_info("%s\n", "NUMA turned off");
> +               numa_off = 1;
> +       }
> +       return 0;
> +}
> +early_param("numa", numa_setup);
> +
> +cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
> +EXPORT_SYMBOL(node_to_cpumask_map);
> +
> +int cpu_to_node_map[NR_CPUS];
> +EXPORT_SYMBOL(cpu_to_node_map);
> +
> +/*
> + * Returns a pointer to the bitmask of CPUs on Node 'node'.
> + */
> +const struct cpumask *cpumask_of_node(int node)
> +{
> +       if (node >= nr_node_ids) {
> +               pr_warn("cpumask_of_node(%d): node >= nr_node_ids(%d)\n",
> +                       node, nr_node_ids);
> +               dump_stack();
> +               return cpu_none_mask;
> +       }
> +       if (node_to_cpumask_map[node] == NULL) {
> +               pr_warn("cpumask_of_node(%d): no node_to_cpumask_map!\n",
> +                       node);
> +               dump_stack();
> +               return cpu_online_mask;
> +       }
> +       return node_to_cpumask_map[node];
> +}
> +EXPORT_SYMBOL(cpumask_of_node);
> +
> +void numa_clear_node(int cpu)
> +{
> +       set_cpu_numa_node(cpu, NUMA_NO_NODE);
> +}
> +
> +void map_cpu_to_node(int cpu, int nid)
> +{
> +       if (nid < 0) { /* just initialize by zero */
> +               cpu_to_node_map[cpu] = 0;
> +               return;
> +       }
> +
> +       cpu_to_node_map[cpu] = nid;
> +       cpumask_set_cpu(cpu, node_to_cpumask_map[nid]);
> +       set_numa_node(nid);
> +}
> +
> +/**
> + * build_cpu_to_node_map - setup cpu to node and node to cpumask arrays
> + *
> + * Build cpu to node mapping and initialize the per node cpu masks using
> + * info from the node_cpu_hwid array handed to us by ACPI or DT.
> + */
> +void __init build_cpu_to_node_map(void)
> +{
> +       int cpu, i, node;
> +
> +       for (node = 0; node < MAX_NUMNODES; node++)
> +               cpumask_clear(node_to_cpumask_map[node]);
> +
> +       for_each_possible_cpu(cpu) {
> +               node = NUMA_NO_NODE;
> +               for_each_possible_cpu(i) {
> +                       if (cpu_logical_map(cpu) == node_cpu_hwid[i].cpu_hwid) {
> +                               node = node_cpu_hwid[i].node_id;
> +                               break;
> +                       }
> +               }
> +               map_cpu_to_node(cpu, node);
> +       }
> +}
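
The loop above resolves each CPU's node by a linear match on its hardware id. A minimal userspace sketch of that matching step follows; it is not the patch's code. `node_for_hwid` and `NO_NODE` are names made up for illustration, and the struct only mirrors the shape of `struct __node_cpu_hwid`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_NODE (-1)  /* hypothetical stand-in for the kernel's NUMA_NO_NODE */

/* Mirrors the shape of the patch's struct __node_cpu_hwid. */
struct node_cpu_hwid {
	uint32_t node_id;   /* logical node containing this CPU */
	uint64_t cpu_hwid;  /* hardware id (MPIDR on arm64) */
};

/* Return the node recorded for hwid, or NO_NODE if it is not listed. */
static int node_for_hwid(const struct node_cpu_hwid *tbl, size_t n,
			 uint64_t hwid)
{
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (tbl[i].cpu_hwid == hwid)
			return (int)tbl[i].node_id;
	return NO_NODE;
}
```

In the patch, a CPU with no match is not left unassigned: map_cpu_to_node() treats a negative nid as node 0.
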
> +/*
> + * Allocate node_to_cpumask_map based on number of available nodes
> + * Requires node_possible_map to be valid.
> + *
> + * Note: cpumask_of_node() is not valid until after this is done.
> + * (Use CONFIG_DEBUG_PER_CPU_MAPS to check this.)
> + */
> +void __init setup_node_to_cpumask_map(void)
> +{
> +       unsigned int node;
> +
> +       /* setup nr_node_ids if not done yet */
> +       if (nr_node_ids == MAX_NUMNODES)
> +               setup_nr_node_ids();
> +
> +       /* allocate the map */
> +       for (node = 0; node < nr_node_ids; node++)
> +               alloc_bootmem_cpumask_var(&node_to_cpumask_map[node]);
> +
> +       /* cpumask_of_node() will now work */
> +       pr_debug("Node to cpumask map for %d nodes\n", nr_node_ids);
> +}
> +
> +/*
> + *  Set the cpu to node and mem mapping
> + */
> +void numa_store_cpu_info(int cpu)
> +{
> +       if (dummy_numa_enabled) {
> +               /* set to default */
> +               node_cpu_hwid[cpu].node_id  =  0;
> +               node_cpu_hwid[cpu].cpu_hwid = cpu_logical_map(cpu);
> +       }
> +       map_cpu_to_node(cpu, node_cpu_hwid[cpu].node_id);
> +}
> +
> +/**
> + * numa_add_memblk_to - Add one numa_memblk to a numa_meminfo
> + */
> +
> +static int __init numa_add_memblk_to(int nid, u64 start, u64 end,
> +                                    struct numa_meminfo *mi)
> +{
> +       /* ignore zero length blks */
> +       if (start == end)
> +               return 0;
> +
> +       /* whine about and ignore invalid blks */
> +       if (start > end || nid < 0 || nid >= MAX_NUMNODES) {
> +               pr_warn("NUMA: Warning: invalid memblk node %d [mem %#010Lx-%#010Lx]\n",
> +                               nid, start, end - 1);
> +               return 0;
> +       }
> +
> +       if (mi->nr_blks >= NR_NODE_MEMBLKS) {
> +               pr_err("NUMA: too many memblk ranges\n");
> +               return -EINVAL;
> +       }
> +
> +       pr_info("NUMA: Adding memblock %d [0x%llx - 0x%llx] on node %d\n",
> +                       mi->nr_blks, start, end, nid);
> +       mi->blk[mi->nr_blks].start = start;
> +       mi->blk[mi->nr_blks].end = end;
> +       mi->blk[mi->nr_blks].nid = nid;
> +       mi->nr_blks++;
> +       return 0;
> +}
> +
> +/**
> + * numa_add_memblk - Add one numa_memblk to numa_meminfo
> + * @nid: NUMA node ID of the new memblk
> + * @start: Start address of the new memblk
> + * @end: End address of the new memblk
> + *
> + * Add a new memblk to the default numa_meminfo.
> + *
> + * RETURNS:
> + * 0 on success, -errno on failure.
> + */
> +#define MAX_PHYS_ADDR  ((phys_addr_t)~0)
> +
> +int __init numa_add_memblk(u32 nid, u64 base, u64 end)
> +{
> +       const u64 phys_offset = __pa(PAGE_OFFSET);
> +
> +       base &= PAGE_MASK;
> +       end &= PAGE_MASK;
> +
> +       if (base > MAX_PHYS_ADDR) {
> +               pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
> +                               base, base + end);
> +               return -ENOMEM;
> +       }
> +
> +       if (base + end > MAX_PHYS_ADDR) {
> +               pr_info("NUMA: Ignoring memory range 0x%lx - 0x%llx\n",
> +                               ULONG_MAX, base + end);
> +               end = MAX_PHYS_ADDR - base;
> +       }
> +
> +       if (base + end < phys_offset) {
> +               pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
> +                          base, base + end);
> +               return -ENOMEM;
> +       }
> +       if (base < phys_offset) {
> +               pr_info("NUMA: Ignoring memory range 0x%llx - 0x%llx\n",
> +                          base, phys_offset);
> +               end -= phys_offset - base;
> +               base = phys_offset;
> +       }
> +
> +       return numa_add_memblk_to(nid, base, base + end, &numa_meminfo);
> +}
> +EXPORT_SYMBOL(numa_add_memblk);
> +
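
The third parameter of numa_add_memblk() is named `end`, but the arithmetic above (`base + end`) appears to treat it as a size. A minimal userspace sketch of the clamping, assuming base/size semantics, follows. `clamp_range` and `struct phys_range` are hypothetical names, and `lo`/`hi` stand in for phys_offset and the maximum physical address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: a half-open range [start, end). */
struct phys_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Clamp [base, base + size) to the window [lo, hi).
 * Returns 0 and fills *out on success, -1 if nothing is left.
 */
static int clamp_range(uint64_t base, uint64_t size,
		       uint64_t lo, uint64_t hi, struct phys_range *out)
{
	uint64_t end;

	assert(lo <= hi);
	/* Saturate instead of wrapping if base + size overflows. */
	end = (size > UINT64_MAX - base) ? UINT64_MAX : base + size;

	if (base < lo)
		base = lo;
	if (end > hi)
		end = hi;
	if (base >= end)
		return -1;

	out->start = base;
	out->end = end;
	return 0;
}
```

Returning an explicit start/end pair keeps the caller from having to guess whether a value is an end address or a length.
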
> +/* Initialize NODE_DATA for a node on the local memory */
> +static void __init setup_node_data(int nid, u64 start, u64 end)
> +{
> +       const size_t nd_size = roundup(sizeof(pg_data_t), PAGE_SIZE);
> +       u64 nd_pa;
> +       void *nd;
> +       int tnid;
> +
> +       start = roundup(start, ZONE_ALIGN);
> +
> +       pr_info("Initmem setup node %d [mem %#010Lx-%#010Lx]\n",
> +              nid, start, end - 1);
> +
> +       /*
> +        * Allocate node data.  Try node-local memory and then any node.
> +        */
> +       nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
> +       if (!nd_pa) {
> +               nd_pa = __memblock_alloc_base(nd_size, SMP_CACHE_BYTES,
> +                                             MEMBLOCK_ALLOC_ACCESSIBLE);
> +               if (!nd_pa) {
> +                       pr_err("Cannot find %zu bytes in node %d\n",
> +                              nd_size, nid);
> +                       return;
> +               }
> +       }
> +       nd = __va(nd_pa);
> +
> +       /* report and initialize */
> +       pr_info("  NODE_DATA [mem %#010Lx-%#010Lx]\n",
> +              nd_pa, nd_pa + nd_size - 1);
> +       tnid = early_pfn_to_nid(nd_pa >> PAGE_SHIFT);
> +       if (tnid != nid)
> +               pr_info("    NODE_DATA(%d) on node %d\n", nid, tnid);
> +
> +       node_data[nid] = nd;
> +       memset(NODE_DATA(nid), 0, sizeof(pg_data_t));
> +       NODE_DATA(nid)->node_id = nid;
> +       NODE_DATA(nid)->node_start_pfn = start >> PAGE_SHIFT;
> +       NODE_DATA(nid)->node_spanned_pages = (end - start) >> PAGE_SHIFT;
> +
> +       node_set_online(nid);
> +}
> +
> +/*
> + * Set nodes, which have memory in @mi, in *@nodemask.
> + */
> +static void __init numa_nodemask_from_meminfo(nodemask_t *nodemask,
> +                                             const struct numa_meminfo *mi)
> +{
> +       int i;
> +
> +       for (i = 0; i < ARRAY_SIZE(mi->blk); i++)
> +               if (mi->blk[i].start != mi->blk[i].end &&
> +                   mi->blk[i].nid != NUMA_NO_NODE)
> +                       node_set(mi->blk[i].nid, *nodemask);
> +}
> +
> +/*
> + * Sanity check to catch more bad NUMA configurations (they are amazingly
> + * common).  Make sure the nodes cover all memory.
> + */
> +static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
> +{
> +       u64 numaram, totalram;
> +       int i;
> +
> +       numaram = 0;
> +       for (i = 0; i < mi->nr_blks; i++) {
> +               u64 s = mi->blk[i].start >> PAGE_SHIFT;
> +               u64 e = mi->blk[i].end >> PAGE_SHIFT;
> +
> +               numaram += e - s;
> +               numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
> +               if ((s64)numaram < 0)
> +                       numaram = 0;
> +       }
> +
> +       totalram = max_pfn - absent_pages_in_range(0, max_pfn);
> +
> +       /* We seem to lose 3 pages somewhere. Allow 1M of slack. */
> +       if ((s64)(totalram - numaram) >= (1 << (20 - PAGE_SHIFT))) {
> +               pr_err("NUMA: nodes only cover %lluMB of your %lluMB Total RAM. Not used.\n",
> +                      (numaram << PAGE_SHIFT) >> 20,
> +                      (totalram << PAGE_SHIFT) >> 20);
> +               return false;
> +       }
> +       return true;
> +}
> +
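
The coverage check above compares page counts and allows 1 MB of slack, expressed as `1 << (20 - PAGE_SHIFT)` pages. A minimal userspace sketch of that arithmetic follows, assuming 4 KiB pages; `slack_pages` and `nodes_cover_memory` are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumption: 4 KiB pages; other sizes change the result */

/* Number of pages in 1 MiB for the assumed page size. */
static uint64_t slack_pages(void)
{
	assert(PAGE_SHIFT <= 20);
	return UINT64_C(1) << (20 - PAGE_SHIFT);
}

/*
 * True if the pages described by the NUMA blocks (numa_pages) come
 * within 1 MiB of the pages the system actually has (total_pages).
 */
static bool nodes_cover_memory(uint64_t numa_pages, uint64_t total_pages)
{
	/* Signed difference, so over-coverage counts as covered. */
	int64_t missing = (int64_t)(total_pages - numa_pages);

	return missing < (int64_t)slack_pages();
}
```

With 4 KiB pages the slack is 256 pages, so a configuration that misses 255 pages is accepted and one that misses 256 is rejected.
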
> +/**
> + * numa_reset_distance - Reset NUMA distance table
> + *
> + * The current table is freed.  The next numa_set_distance() call will
> + * create a new one.
> + */
> +void __init numa_reset_distance(void)
> +{
> +       size_t size = numa_distance_cnt * numa_distance_cnt *
> +               sizeof(numa_distance[0]);
> +
> +       /* numa_distance could be 1LU marking allocation failure, test cnt */
> +       if (numa_distance_cnt)
> +               memblock_free(__pa(numa_distance), size);
> +       numa_distance_cnt = 0;
> +       numa_distance = NULL;   /* enable table creation */
> +}
> +
> +static int __init numa_alloc_distance(void)
> +{
> +       nodemask_t nodes_parsed;
> +       size_t size;
> +       int i, j, cnt = 0;
> +       u64 phys;
> +
> +       /* size the new table and allocate it */
> +       nodes_parsed = numa_nodes_parsed;
> +       numa_nodemask_from_meminfo(&nodes_parsed, &numa_meminfo);
> +
> +       for_each_node_mask(i, nodes_parsed)
> +               cnt = i;
> +       cnt++;
> +       size = cnt * cnt * sizeof(numa_distance[0]);
> +
> +       phys = memblock_find_in_range(0, PFN_PHYS(max_pfn),
> +                                     size, PAGE_SIZE);
> +       if (!phys) {
> +               pr_warn("NUMA: Warning: can't allocate distance table!\n");
> +               /* don't retry until explicitly reset */
> +               numa_distance = (void *)1LU;
> +               return -ENOMEM;
> +       }
> +       memblock_reserve(phys, size);
> +
> +       numa_distance = __va(phys);
> +       numa_distance_cnt = cnt;
> +
> +       /* fill with the default distances */
> +       for (i = 0; i < cnt; i++)
> +               for (j = 0; j < cnt; j++)
> +                       numa_distance[i * cnt + j] = i == j ?
> +                               LOCAL_DISTANCE : REMOTE_DISTANCE;
> +       pr_debug("NUMA: Initialized distance table, cnt=%d\n", cnt);
> +
> +       return 0;
> +}
> +
> +/**
> + * numa_set_distance - Set NUMA distance from one NUMA node to another
> + * @from: the 'from' node to set distance
> + * @to: the 'to'  node to set distance
> + * @distance: NUMA distance
> + *
> + * Set the distance from node @from to @to to @distance.  If distance table
> + * doesn't exist, one which is large enough to accommodate all the currently
> + * known nodes will be created.
> + *
> + * If such table cannot be allocated, a warning is printed and further
> + * calls are ignored until the distance table is reset with
> + * numa_reset_distance().
> + *
> + * If @from or @to is higher than the highest known node or lower than zero
> + * at the time of table creation or @distance doesn't make sense, the call
> + * is ignored.
> + * This is to allow simplification of specific NUMA config implementations.
> + */
> +void __init numa_set_distance(int from, int to, int distance)
> +{
> +       if (!numa_distance && numa_alloc_distance() < 0)
> +               return;
> +
> +       if (from >= numa_distance_cnt || to >= numa_distance_cnt ||
> +                       from < 0 || to < 0) {
> +               pr_warn_once("NUMA: Warning: node ids are out of bound, from=%d to=%d distance=%d\n",
> +                           from, to, distance);
> +               return;
> +       }
> +
> +       if ((u8)distance != distance ||
> +           (from == to && distance != LOCAL_DISTANCE)) {
> +               pr_warn_once("NUMA: Warning: invalid distance parameter, from=%d to=%d distance=%d\n",
> +                            from, to, distance);
> +               return;
> +       }
> +
> +       numa_distance[from * numa_distance_cnt + to] = distance;
> +}
> +EXPORT_SYMBOL(numa_set_distance);
> +
> +int __node_distance(int from, int to)
> +{
> +       if (from >= numa_distance_cnt || to >= numa_distance_cnt)
> +               return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
> +       return numa_distance[from * numa_distance_cnt + to];
> +}
> +EXPORT_SYMBOL(__node_distance);
> +
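
The distance table above is a flat cnt x cnt array indexed as `from * cnt + to`, pre-filled with LOCAL_DISTANCE on the diagonal and REMOTE_DISTANCE elsewhere. A minimal userspace sketch of that layout follows, using malloc in place of memblock. `struct distance_table` and the `distance_table_*` helpers are hypothetical names; the values 10 and 20 are the kernel's default LOCAL_DISTANCE and REMOTE_DISTANCE.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LOCAL_DISTANCE  10
#define REMOTE_DISTANCE 20

/* Hypothetical userspace stand-in for the patch's static table. */
struct distance_table {
	int cnt;        /* number of node ids covered (highest id + 1) */
	uint8_t *dist;  /* cnt * cnt entries, row-major: dist[from * cnt + to] */
};

/* Allocate a cnt x cnt table filled with the default distances. */
static int distance_table_init(struct distance_table *t, int cnt)
{
	int i, j;

	assert(cnt > 0);
	t->dist = malloc((size_t)cnt * cnt);
	if (!t->dist)
		return -1;
	t->cnt = cnt;
	for (i = 0; i < cnt; i++)
		for (j = 0; j < cnt; j++)
			t->dist[i * cnt + j] =
				(i == j) ? LOCAL_DISTANCE : REMOTE_DISTANCE;
	return 0;
}

/* Store one entry; returns -1 (table unchanged) for rejected input. */
static int distance_table_set(struct distance_table *t,
			      int from, int to, int distance)
{
	if (from < 0 || to < 0 || from >= t->cnt || to >= t->cnt)
		return -1;
	if (distance < 0 || distance > UINT8_MAX)  /* must fit in a u8 */
		return -1;
	if (from == to && distance != LOCAL_DISTANCE)
		return -1;
	t->dist[from * t->cnt + to] = (uint8_t)distance;
	return 0;
}

/* Look up a distance; ids outside the table fall back to the defaults. */
static int distance_table_get(const struct distance_table *t, int from, int to)
{
	if (from < 0 || to < 0 || from >= t->cnt || to >= t->cnt)
		return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
	return t->dist[from * t->cnt + to];
}
```

Unlike __node_distance() in the patch, the lookup in this sketch also rejects negative ids before indexing.
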
> +static int __init numa_register_memblks(struct numa_meminfo *mi)
> +{
> +       unsigned long uninitialized_var(pfn_align);
> +       int i, nid;
> +
> +       /* Account for nodes with cpus and no memory */
> +       node_possible_map = numa_nodes_parsed;
> +       numa_nodemask_from_meminfo(&node_possible_map, mi);
> +       if (WARN_ON(nodes_empty(node_possible_map)))
> +               return -EINVAL;
> +
> +       for (i = 0; i < mi->nr_blks; i++) {
> +               struct numa_memblk *mb = &mi->blk[i];
> +
> +               memblock_set_node(mb->start, mb->end - mb->start,
> +                                 &memblock.memory, mb->nid);
> +       }
> +
> +       /*
> +        * If sections array is gonna be used for pfn -> nid mapping, check
> +        * whether its granularity is fine enough.
> +        */
> +#ifdef NODE_NOT_IN_PAGE_FLAGS
> +       pfn_align = node_map_pfn_alignment();
> +       if (pfn_align && pfn_align < PAGES_PER_SECTION) {
> +               pr_warn("Node alignment %lluMB < min %lluMB, rejecting NUMA config\n",
> +                      PFN_PHYS(pfn_align) >> 20,
> +                      PFN_PHYS(PAGES_PER_SECTION) >> 20);
> +               return -EINVAL;
> +       }
> +#endif
> +       if (!numa_meminfo_cover_memory(mi))
> +               return -EINVAL;
> +
> +       /* Finally register nodes. */
> +       for_each_node_mask(nid, node_possible_map) {
> +               u64 start = PFN_PHYS(max_pfn);
> +               u64 end = 0;
> +
> +               for (i = 0; i < mi->nr_blks; i++) {
> +                       if (nid != mi->blk[i].nid)
> +                               continue;
> +                       start = min(mi->blk[i].start, start);
> +                       end = max(mi->blk[i].end, end);
> +               }
> +
> +               if (start < end)
> +                       setup_node_data(nid, start, end);
> +       }
> +
> +       /* Dump memblock with node info and return. */
> +       memblock_dump_all();
> +       return 0;
> +}
> +
> +static int __init numa_init(int (*init_func)(void))
> +{
> +       int ret, i;
> +
> +       nodes_clear(numa_nodes_parsed);
> +       nodes_clear(node_possible_map);
> +       nodes_clear(node_online_map);
> +
> +       ret = init_func();
> +       if (ret < 0)
> +               return ret;
> +
> +       ret = numa_register_memblks(&numa_meminfo);
> +       if (ret < 0)
> +               return ret;
> +
> +       for (i = 0; i < nr_cpu_ids; i++)
> +               numa_clear_node(i);
> +
> +       setup_node_to_cpumask_map();
> +       build_cpu_to_node_map();
> +       return 0;
> +}
> +
> +/**
> + * dummy_numa_init - Fallback dummy NUMA init
> + *
> + * Used if there's no underlying NUMA architecture, NUMA initialization
> + * fails, or NUMA is disabled on the command line.
> + *
> + * Must online at least one node and add memory blocks that cover all
> + * allowed memory.  This function must not fail.
> + */
> +static int __init dummy_numa_init(void)
> +{
> +       pr_info("%s\n", "No NUMA configuration found");
> +       pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n",
> +              0LLU, PFN_PHYS(max_pfn) - 1);
> +       node_set(0, numa_nodes_parsed);
> +       numa_add_memblk(0, 0, PFN_PHYS(max_pfn));
> +       dummy_numa_enabled = 1;
> +
> +       return 0;
> +}
> +
> +/**
> + * arm64_numa_init - Initialize NUMA
> + *
> + * Try each configured NUMA initialization method until one succeeds.  The
> + * last fallback is dummy single node config encompassing whole memory and
> + * never fails.
> + */
> +void __init arm64_numa_init(void)
> +{
> +       numa_init(dummy_numa_init);
> +}
> --
> 1.8.1.4
>
Can you please review this patch?

Thanks,
Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 1/4] arm64, numa: adding numa support for arm64 platforms.
@ 2015-09-03  9:52       ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-09-03  9:52 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Catalin/Will,

On Fri, Aug 14, 2015 at 10:09 PM, Ganapatrao Kulkarni
<gkulkarni@caviumnetworks.com> wrote:
> Add NUMA support for arm64-based platforms.
> By default, this patch adds a dummy NUMA node and
> maps all memory and CPUs to node 0.
> With this patch, NUMA can be simulated on single-node arm64 platforms.
>
> Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
> ---
>  arch/arm64/Kconfig              |  26 ++
>  arch/arm64/include/asm/mmzone.h |  32 +++
>  arch/arm64/include/asm/numa.h   |  42 +++
>  arch/arm64/kernel/setup.c       |   9 +
>  arch/arm64/kernel/smp.c         |   2 +
>  arch/arm64/mm/Makefile          |   1 +
>  arch/arm64/mm/init.c            |  34 ++-
>  arch/arm64/mm/numa.c            | 550 ++++++++++++++++++++++++++++++++++++++++
>  8 files changed, 690 insertions(+), 6 deletions(-)
>  create mode 100644 arch/arm64/include/asm/mmzone.h
>  create mode 100644 arch/arm64/include/asm/numa.h
>  create mode 100644 arch/arm64/mm/numa.c
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 43a0c26..fa37a5d 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -72,6 +72,7 @@ config ARM64
>         select HAVE_PERF_USER_STACK_DUMP
>         select HAVE_RCU_TABLE_FREE
>         select HAVE_SYSCALL_TRACEPOINTS
> +       select HAVE_MEMBLOCK_NODE_MAP if NUMA
>         select IRQ_DOMAIN
>         select IRQ_FORCED_THREADING
>         select MODULES_USE_ELF_RELA
> @@ -559,6 +560,31 @@ config HOTPLUG_CPU
>           Say Y here to experiment with turning CPUs off and on.  CPUs
>           can be controlled through /sys/devices/system/cpu.
>
> +# Common NUMA Features
> +config NUMA
> +       bool "NUMA Memory Allocation and Scheduler Support"
> +       depends on SMP
> +       help
> +         Enable NUMA (Non Uniform Memory Access) support.
> +
> +         The kernel will try to allocate memory used by a CPU on the
> +         local memory controller of the CPU and add some more
> +         NUMA awareness to the kernel.
> +
> +config NODES_SHIFT
> +       int "Maximum NUMA Nodes (as a power of 2)"
> +       range 1 10
> +       default "2"
> +       depends on NEED_MULTIPLE_NODES
> +       help
> +         Specify the maximum number of NUMA Nodes available on the target
> +         system.  Increases memory reserved to accommodate various tables.
> +
> +config USE_PERCPU_NUMA_NODE_ID
> +       def_bool y
> +       depends on NUMA
> +
> +
>  source kernel/Kconfig.preempt
>
>  config UP_LATE_INIT
> diff --git a/arch/arm64/include/asm/mmzone.h b/arch/arm64/include/asm/mmzone.h
> new file mode 100644
> index 0000000..d27ee66
> --- /dev/null
> +++ b/arch/arm64/include/asm/mmzone.h
> @@ -0,0 +1,32 @@
> +#ifndef __ASM_ARM64_MMZONE_H_
> +#define __ASM_ARM64_MMZONE_H_
> +
> +#ifdef CONFIG_NUMA
> +
> +#include <linux/mmdebug.h>
> +#include <asm/smp.h>
> +#include <linux/types.h>
> +#include <asm/numa.h>
> +
> +extern struct pglist_data *node_data[];
> +
> +#define NODE_DATA(nid)         (node_data[nid])
> +
> +
> +struct numa_memblk {
> +       u64                     start;
> +       u64                     end;
> +       int                     nid;
> +};
> +
> +struct numa_meminfo {
> +       int                     nr_blks;
> +       struct numa_memblk      blk[NR_NODE_MEMBLKS];
> +};
> +
> +void __init numa_remove_memblk_from(int idx, struct numa_meminfo *mi);
> +int __init numa_cleanup_meminfo(struct numa_meminfo *mi);
> +void __init numa_reset_distance(void);
> +
> +#endif /* CONFIG_NUMA */
> +#endif /* __ASM_ARM64_MMZONE_H_ */
> diff --git a/arch/arm64/include/asm/numa.h b/arch/arm64/include/asm/numa.h
> new file mode 100644
> index 0000000..59b834e
> --- /dev/null
> +++ b/arch/arm64/include/asm/numa.h
> @@ -0,0 +1,42 @@
> +#ifndef _ASM_NUMA_H
> +#define _ASM_NUMA_H
> +
> +#include <linux/nodemask.h>
> +#include <asm/topology.h>
> +
> +#ifdef CONFIG_NUMA
> +
> +#define NR_NODE_MEMBLKS                (MAX_NUMNODES * 2)
> +#define ZONE_ALIGN (1UL << (MAX_ORDER + PAGE_SHIFT))
> +
> +/* currently, arm64 implements flat NUMA topology */
> +#define parent_node(node)      (node)
> +
> +/* dummy definitions for pci functions */
> +#define pcibus_to_node(node)   0
> +#define cpumask_of_pcibus(bus) 0
> +
> +struct __node_cpu_hwid {
> +       u32 node_id;    /* logical node containing this CPU */
> +       u64 cpu_hwid;   /* MPIDR for this CPU */
> +};
> +
> +extern struct __node_cpu_hwid node_cpu_hwid[NR_CPUS];
> +extern nodemask_t numa_nodes_parsed __initdata;
> +
> +const struct cpumask *cpumask_of_node(int node);
> +/* Mappings between node number and cpus on that node. */
> +extern cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
> +
> +void __init arm64_numa_init(void);
> +int __init numa_add_memblk(u32 nodeid, u64 start, u64 end);
> +void numa_store_cpu_info(int cpu);
> +void __init build_cpu_to_node_map(void);
> +void __init numa_set_distance(int from, int to, int distance);
> +#else  /* CONFIG_NUMA */
> +static inline void numa_store_cpu_info(int cpu)                { }
> +static inline void arm64_numa_init(void)               { }
> +static inline void build_cpu_to_node_map(void) { }
> +static inline void numa_set_distance(int from, int to, int distance) { }
> +#endif /* CONFIG_NUMA */
> +#endif /* _ASM_NUMA_H */
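
For reference, NODES_SHIFT from the Kconfig hunk determines how large the node-indexed tables in this header become, since the kernel defines MAX_NUMNODES as `1 << NODES_SHIFT` and the patch sets NR_NODE_MEMBLKS to twice that. A minimal sketch of the arithmetic follows; `max_numnodes` and `nr_node_memblks` are hypothetical helper names that model the macros as functions.

```c
#include <assert.h>

/* Models MAX_NUMNODES, which the kernel defines as 1 << NODES_SHIFT. */
static unsigned int max_numnodes(unsigned int nodes_shift)
{
	assert(nodes_shift >= 1 && nodes_shift <= 10);  /* Kconfig range */
	return 1u << nodes_shift;
}

/* Models NR_NODE_MEMBLKS, defined in the patch as MAX_NUMNODES * 2. */
static unsigned int nr_node_memblks(unsigned int nodes_shift)
{
	return max_numnodes(nodes_shift) * 2;
}
```

With the default NODES_SHIFT of 2 this gives 4 possible nodes and 8 memblk slots; at the upper bound of 10 it gives 1024 nodes and 2048 slots.
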
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 4d4d7ce..6e101eb 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -65,6 +65,7 @@
>  #include <asm/efi.h>
>  #include <asm/virt.h>
>  #include <asm/xen/hypervisor.h>
> +#include <asm/numa.h>
>
>  unsigned long elf_hwcap __read_mostly;
>  EXPORT_SYMBOL_GPL(elf_hwcap);
> @@ -439,6 +440,9 @@ static int __init topology_init(void)
>  {
>         int i;
>
> +       for_each_online_node(i)
> +               register_one_node(i);
> +
>         for_each_possible_cpu(i) {
>                 struct cpu *cpu = &per_cpu(cpu_data.cpu, i);
>                 cpu->hotpluggable = 1;
> @@ -511,7 +515,12 @@ static int c_show(struct seq_file *m, void *v)
>                  * "processor".  Give glibc what it expects.
>                  */
>  #ifdef CONFIG_SMP
> +       if (IS_ENABLED(CONFIG_NUMA)) {
> +               seq_printf(m, "processor\t: %d", i);
> +               seq_printf(m, " [nid: %d]\n", cpu_to_node(i));
> +       } else {
>                 seq_printf(m, "processor\t: %d\n", i);
> +       }
>  #endif
>
>                 /*
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 50fb469..ae3e02c 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -52,6 +52,7 @@
>  #include <asm/sections.h>
>  #include <asm/tlbflush.h>
>  #include <asm/ptrace.h>
> +#include <asm/numa.h>
>
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/ipi.h>
> @@ -124,6 +125,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>  static void smp_store_cpu_info(unsigned int cpuid)
>  {
>         store_cpu_topology(cpuid);
> +       numa_store_cpu_info(cpuid);
>  }
>
>  /*
> diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
> index 773d37a..bb92d41 100644
> --- a/arch/arm64/mm/Makefile
> +++ b/arch/arm64/mm/Makefile
> @@ -4,3 +4,4 @@ obj-y                           := dma-mapping.o extable.o fault.o init.o \
>                                    context.o proc.o pageattr.o
>  obj-$(CONFIG_HUGETLB_PAGE)     += hugetlbpage.o
>  obj-$(CONFIG_ARM64_PTDUMP)     += dump.o
> +obj-$(CONFIG_NUMA)             += numa.o
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 54da32e..cab384b 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -42,6 +42,7 @@
>  #include <asm/sizes.h>
>  #include <asm/tlb.h>
>  #include <asm/alternative.h>
> +#include <asm/numa.h>
>
>  #include "mm.h"
>
> @@ -77,6 +78,20 @@ static phys_addr_t max_zone_dma_phys(void)
>         return min(offset + (1ULL << 32), memblock_end_of_DRAM());
>  }
>
> +#ifdef CONFIG_NUMA
> +static void __init zone_sizes_init(unsigned long min, unsigned long max)
> +{
> +       unsigned long max_zone_pfns[MAX_NR_ZONES];
> +
> +       memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
> +       if (IS_ENABLED(CONFIG_ZONE_DMA))
> +               max_zone_pfns[ZONE_DMA] = PFN_DOWN(max_zone_dma_phys());
> +       max_zone_pfns[ZONE_NORMAL] = max;
> +
> +       free_area_init_nodes(max_zone_pfns);
> +}
> +
> +#else
>  static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  {
>         struct memblock_region *reg;
> @@ -115,6 +130,7 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>
>         free_area_init_node(0, zone_size, min, zhole_size);
>  }
> +#endif /* CONFIG_NUMA */
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>  int pfn_valid(unsigned long pfn)
> @@ -132,10 +148,15 @@ static void arm64_memory_present(void)
>  static void arm64_memory_present(void)
>  {
>         struct memblock_region *reg;
> +       int nid = 0;
>
> -       for_each_memblock(memory, reg)
> -               memory_present(0, memblock_region_memory_base_pfn(reg),
> -                              memblock_region_memory_end_pfn(reg));
> +       for_each_memblock(memory, reg) {
> +#ifdef CONFIG_NUMA
> +               nid = reg->nid;
> +#endif
> +               memory_present(nid, memblock_region_memory_base_pfn(reg),
> +                               memblock_region_memory_end_pfn(reg));
> +       }
>  }
>  #endif
>
> @@ -200,6 +221,10 @@ void __init bootmem_init(void)
>         min = PFN_UP(memblock_start_of_DRAM());
>         max = PFN_DOWN(memblock_end_of_DRAM());
>
> +       high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
> +       max_pfn = max_low_pfn = max;
> +
> +       arm64_numa_init();
>         /*
>          * Sparsemem tries to allocate bootmem in memory_present(), so must be
>          * done after the fixed reservations.
> @@ -208,9 +233,6 @@ void __init bootmem_init(void)
>
>         sparse_init();
>         zone_sizes_init(min, max);
> -
> -       high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
> -       max_pfn = max_low_pfn = max;
>  }
>
>  #ifndef CONFIG_SPARSEMEM_VMEMMAP
> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
> new file mode 100644
> index 0000000..2be83de
> --- /dev/null
> +++ b/arch/arm64/mm/numa.c
> @@ -0,0 +1,550 @@
> +/*
> + * NUMA support, based on the x86 implementation.
> + *
> + * Copyright (C) 2015 Cavium Inc.
> + * Author: Ganapatrao Kulkarni <gkulkarni@cavium.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/mm.h>
> +#include <linux/string.h>
> +#include <linux/init.h>
> +#include <linux/bootmem.h>
> +#include <linux/memblock.h>
> +#include <linux/ctype.h>
> +#include <linux/module.h>
> +#include <linux/nodemask.h>
> +#include <linux/sched.h>
> +#include <linux/topology.h>
> +#include <linux/mmzone.h>
> +
> +#include <asm/smp_plat.h>
> +
> +int __initdata numa_off;
> +nodemask_t numa_nodes_parsed __initdata;
> +static int numa_distance_cnt;
> +static u8 *numa_distance;
> +static u8 dummy_numa_enabled;
> +
> +struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
> +EXPORT_SYMBOL(node_data);
> +
> +struct __node_cpu_hwid node_cpu_hwid[NR_CPUS];
> +static struct numa_meminfo numa_meminfo;
> +
> +static __init int numa_setup(char *opt)
> +{
> +       if (!opt)
> +               return -EINVAL;
> +       if (!strncmp(opt, "off", 3)) {
> +               pr_info("%s\n", "NUMA turned off");
> +               numa_off = 1;
> +       }
> +       return 0;
> +}
> +early_param("numa", numa_setup);
> +
> +cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
> +EXPORT_SYMBOL(node_to_cpumask_map);
> +
> +int cpu_to_node_map[NR_CPUS];
> +EXPORT_SYMBOL(cpu_to_node_map);
> +
> +/*
> + * Returns a pointer to the bitmask of CPUs on Node 'node'.
> + */
> +const struct cpumask *cpumask_of_node(int node)
> +{
> +       if (node >= nr_node_ids) {
> +               pr_warn("cpumask_of_node(%d): node >= nr_node_ids(%d)\n",
> +                       node, nr_node_ids);
> +               dump_stack();
> +               return cpu_none_mask;
> +       }
> +       if (node_to_cpumask_map[node] == NULL) {
> +               pr_warn("cpumask_of_node(%d): no node_to_cpumask_map!\n",
> +                       node);
> +               dump_stack();
> +               return cpu_online_mask;
> +       }
> +       return node_to_cpumask_map[node];
> +}
> +EXPORT_SYMBOL(cpumask_of_node);
> +
> +void numa_clear_node(int cpu)
> +{
> +       set_cpu_numa_node(cpu, NUMA_NO_NODE);
> +}
> +
> +void map_cpu_to_node(int cpu, int nid)
> +{
> +       if (nid < 0) { /* just initialize by zero */
> +               cpu_to_node_map[cpu] = 0;
> +               return;
> +       }
> +
> +       cpu_to_node_map[cpu] = nid;
> +       cpumask_set_cpu(cpu, node_to_cpumask_map[nid]);
> +       set_cpu_numa_node(cpu, nid);
> +}
> +
> +/**
> + * build_cpu_to_node_map - setup cpu to node and node to cpumask arrays
> + *
> + * Build cpu to node mapping and initialize the per node cpu masks using
> + * info from the node_cpu_hwid array handed to us by ACPI or DT.
> + */
> +void __init build_cpu_to_node_map(void)
> +{
> +       int cpu, i, node;
> +
> +       for (node = 0; node < MAX_NUMNODES; node++)
> +               cpumask_clear(node_to_cpumask_map[node]);
> +
> +       for_each_possible_cpu(cpu) {
> +               node = NUMA_NO_NODE;
> +               for_each_possible_cpu(i) {
> +                       if (cpu_logical_map(cpu) == node_cpu_hwid[i].cpu_hwid) {
> +                               node = node_cpu_hwid[i].node_id;
> +                               break;
> +                       }
> +               }
> +               map_cpu_to_node(cpu, node);
> +       }
> +}
> +/*
> + * Allocate node_to_cpumask_map based on number of available nodes
> + * Requires node_possible_map to be valid.
> + *
> + * Note: cpumask_of_node() is not valid until after this is done.
> + * (Use CONFIG_DEBUG_PER_CPU_MAPS to check this.)
> + */
> +void __init setup_node_to_cpumask_map(void)
> +{
> +       unsigned int node;
> +
> +       /* setup nr_node_ids if not done yet */
> +       if (nr_node_ids == MAX_NUMNODES)
> +               setup_nr_node_ids();
> +
> +       /* allocate the map */
> +       for (node = 0; node < nr_node_ids; node++)
> +               alloc_bootmem_cpumask_var(&node_to_cpumask_map[node]);
> +
> +       /* cpumask_of_node() will now work */
> +       pr_debug("Node to cpumask map for %d nodes\n", nr_node_ids);
> +}
> +
> +/*
> + *  Set the cpu to node and mem mapping
> + */
> +void numa_store_cpu_info(int cpu)
> +{
> +       if (dummy_numa_enabled) {
> +               /* set to default */
> +               node_cpu_hwid[cpu].node_id  =  0;
> +               node_cpu_hwid[cpu].cpu_hwid = cpu_logical_map(cpu);
> +       }
> +       map_cpu_to_node(cpu, node_cpu_hwid[cpu].node_id);
> +}
> +
> +/**
> + * numa_add_memblk_to - Add one numa_memblk to a numa_meminfo
> + */
> +
> +static int __init numa_add_memblk_to(int nid, u64 start, u64 end,
> +                                    struct numa_meminfo *mi)
> +{
> +       /* ignore zero length blks */
> +       if (start == end)
> +               return 0;
> +
> +       /* whine about and ignore invalid blks */
> +       if (start > end || nid < 0 || nid >= MAX_NUMNODES) {
> +               pr_warn("NUMA: Warning: invalid memblk node %d [mem %#010Lx-%#010Lx]\n",
> +                               nid, start, end - 1);
> +               return 0;
> +       }
> +
> +       if (mi->nr_blks >= NR_NODE_MEMBLKS) {
> +               pr_err("NUMA: too many memblk ranges\n");
> +               return -EINVAL;
> +       }
> +
> +       pr_info("NUMA: Adding memblock %d [0x%llx - 0x%llx] on node %d\n",
> +                       mi->nr_blks, start, end, nid);
> +       mi->blk[mi->nr_blks].start = start;
> +       mi->blk[mi->nr_blks].end = end;
> +       mi->blk[mi->nr_blks].nid = nid;
> +       mi->nr_blks++;
> +       return 0;
> +}
> +
> +/**
> + * numa_add_memblk - Add one numa_memblk to numa_meminfo
> + * @nid: NUMA node ID of the new memblk
> + * @start: Start address of the new memblk
> + * @end: End address of the new memblk
> + *
> + * Add a new memblk to the default numa_meminfo.
> + *
> + * RETURNS:
> + * 0 on success, -errno on failure.
> + */
> +#define MAX_PHYS_ADDR  ((phys_addr_t)~0)
> +
> +int __init numa_add_memblk(u32 nid, u64 base, u64 end)
> +{
> +       const u64 phys_offset = __pa(PAGE_OFFSET);
> +
> +       base &= PAGE_MASK;
> +       end &= PAGE_MASK;
> +
> +       if (base > MAX_PHYS_ADDR) {
> +               pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
> +                               base, base + end);
> +               return -ENOMEM;
> +       }
> +
> +       if (base + end > MAX_PHYS_ADDR) {
> +               pr_info("NUMA: Ignoring memory range 0x%lx - 0x%llx\n",
> +                               ULONG_MAX, base + end);
> +               end = MAX_PHYS_ADDR - base;
> +       }
> +
> +       if (base + end < phys_offset) {
> +               pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
> +                          base, base + end);
> +               return -ENOMEM;
> +       }
> +       if (base < phys_offset) {
> +               pr_info("NUMA: Ignoring memory range 0x%llx - 0x%llx\n",
> +                          base, phys_offset);
> +               end -= phys_offset - base;
> +               base = phys_offset;
> +       }
> +
> +       return numa_add_memblk_to(nid, base, base + end, &numa_meminfo);
> +}
> +EXPORT_SYMBOL(numa_add_memblk);
> +
> +/* Initialize NODE_DATA for a node on the local memory */
> +static void __init setup_node_data(int nid, u64 start, u64 end)
> +{
> +       const size_t nd_size = roundup(sizeof(pg_data_t), PAGE_SIZE);
> +       u64 nd_pa;
> +       void *nd;
> +       int tnid;
> +
> +       start = roundup(start, ZONE_ALIGN);
> +
> +       pr_info("Initmem setup node %d [mem %#010Lx-%#010Lx]\n",
> +              nid, start, end - 1);
> +
> +       /*
> +        * Allocate node data.  Try node-local memory and then any node.
> +        */
> +       nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
> +       if (!nd_pa) {
> +               nd_pa = __memblock_alloc_base(nd_size, SMP_CACHE_BYTES,
> +                                             MEMBLOCK_ALLOC_ACCESSIBLE);
> +               if (!nd_pa) {
> +                       pr_err("Cannot find %zu bytes in node %d\n",
> +                              nd_size, nid);
> +                       return;
> +               }
> +       }
> +       nd = __va(nd_pa);
> +
> +       /* report and initialize */
> +       pr_info("  NODE_DATA [mem %#010Lx-%#010Lx]\n",
> +              nd_pa, nd_pa + nd_size - 1);
> +       tnid = early_pfn_to_nid(nd_pa >> PAGE_SHIFT);
> +       if (tnid != nid)
> +               pr_info("    NODE_DATA(%d) on node %d\n", nid, tnid);
> +
> +       node_data[nid] = nd;
> +       memset(NODE_DATA(nid), 0, sizeof(pg_data_t));
> +       NODE_DATA(nid)->node_id = nid;
> +       NODE_DATA(nid)->node_start_pfn = start >> PAGE_SHIFT;
> +       NODE_DATA(nid)->node_spanned_pages = (end - start) >> PAGE_SHIFT;
> +
> +       node_set_online(nid);
> +}
> +
> +/*
> + * Set nodes, which have memory in @mi, in *@nodemask.
> + */
> +static void __init numa_nodemask_from_meminfo(nodemask_t *nodemask,
> +                                             const struct numa_meminfo *mi)
> +{
> +       int i;
> +
> +       for (i = 0; i < ARRAY_SIZE(mi->blk); i++)
> +               if (mi->blk[i].start != mi->blk[i].end &&
> +                   mi->blk[i].nid != NUMA_NO_NODE)
> +                       node_set(mi->blk[i].nid, *nodemask);
> +}
> +
> +/*
> + * Sanity check to catch more bad NUMA configurations (they are amazingly
> + * common).  Make sure the nodes cover all memory.
> + */
> +static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
> +{
> +       u64 numaram, totalram;
> +       int i;
> +
> +       numaram = 0;
> +       for (i = 0; i < mi->nr_blks; i++) {
> +               u64 s = mi->blk[i].start >> PAGE_SHIFT;
> +               u64 e = mi->blk[i].end >> PAGE_SHIFT;
> +
> +               numaram += e - s;
> +               numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
> +               if ((s64)numaram < 0)
> +                       numaram = 0;
> +       }
> +
> +       totalram = max_pfn - absent_pages_in_range(0, max_pfn);
> +
> +       /* We seem to lose 3 pages somewhere. Allow 1M of slack. */
> +       if ((s64)(totalram - numaram) >= (1 << (20 - PAGE_SHIFT))) {
> +               pr_err("NUMA: nodes only cover %lluMB of your %lluMB Total RAM. Not used.\n",
> +                      (numaram << PAGE_SHIFT) >> 20,
> +                      (totalram << PAGE_SHIFT) >> 20);
> +               return false;
> +       }
> +       return true;
> +}
> +
> +/**
> + * numa_reset_distance - Reset NUMA distance table
> + *
> + * The current table is freed.  The next numa_set_distance() call will
> + * create a new one.
> + */
> +void __init numa_reset_distance(void)
> +{
> +       size_t size = numa_distance_cnt * numa_distance_cnt *
> +               sizeof(numa_distance[0]);
> +
> +       /* numa_distance could be 1LU marking allocation failure, test cnt */
> +       if (numa_distance_cnt)
> +               memblock_free(__pa(numa_distance), size);
> +       numa_distance_cnt = 0;
> +       numa_distance = NULL;   /* enable table creation */
> +}
> +
> +static int __init numa_alloc_distance(void)
> +{
> +       nodemask_t nodes_parsed;
> +       size_t size;
> +       int i, j, cnt = 0;
> +       u64 phys;
> +
> +       /* size the new table and allocate it */
> +       nodes_parsed = numa_nodes_parsed;
> +       numa_nodemask_from_meminfo(&nodes_parsed, &numa_meminfo);
> +
> +       for_each_node_mask(i, nodes_parsed)
> +               cnt = i;
> +       cnt++;
> +       size = cnt * cnt * sizeof(numa_distance[0]);
> +
> +       phys = memblock_find_in_range(0, PFN_PHYS(max_pfn),
> +                                     size, PAGE_SIZE);
> +       if (!phys) {
> +               pr_warn("NUMA: Warning: can't allocate distance table!\n");
> +               /* don't retry until explicitly reset */
> +               numa_distance = (void *)1LU;
> +               return -ENOMEM;
> +       }
> +       memblock_reserve(phys, size);
> +
> +       numa_distance = __va(phys);
> +       numa_distance_cnt = cnt;
> +
> +       /* fill with the default distances */
> +       for (i = 0; i < cnt; i++)
> +               for (j = 0; j < cnt; j++)
> +                       numa_distance[i * cnt + j] = i == j ?
> +                               LOCAL_DISTANCE : REMOTE_DISTANCE;
> +       pr_debug("NUMA: Initialized distance table, cnt=%d\n", cnt);
> +
> +       return 0;
> +}
> +
> +/**
> + * numa_set_distance - Set NUMA distance from one NUMA to another
> + * @from: the 'from' node to set distance
> + * @to: the 'to'  node to set distance
> + * @distance: NUMA distance
> + *
> + * Set the distance from node @from to @to to @distance.  If distance table
> + * doesn't exist, one which is large enough to accommodate all the currently
> + * known nodes will be created.
> + *
> + * If such table cannot be allocated, a warning is printed and further
> + * calls are ignored until the distance table is reset with
> + * numa_reset_distance().
> + *
> + * If @from or @to is higher than the highest known node or lower than zero
> + * at the time of table creation or @distance doesn't make sense, the call
> + * is ignored.
> + * This is to allow simplification of specific NUMA config implementations.
> + */
> +void __init numa_set_distance(int from, int to, int distance)
> +{
> +       if (!numa_distance && numa_alloc_distance() < 0)
> +               return;
> +
> +       if (from >= numa_distance_cnt || to >= numa_distance_cnt ||
> +                       from < 0 || to < 0) {
> +               pr_warn_once("NUMA: Warning: node ids are out of bound, from=%d to=%d distance=%d\n",
> +                           from, to, distance);
> +               return;
> +       }
> +
> +       if ((u8)distance != distance ||
> +           (from == to && distance != LOCAL_DISTANCE)) {
> +               pr_warn_once("NUMA: Warning: invalid distance parameter, from=%d to=%d distance=%d\n",
> +                            from, to, distance);
> +               return;
> +       }
> +
> +       numa_distance[from * numa_distance_cnt + to] = distance;
> +}
> +EXPORT_SYMBOL(numa_set_distance);
> +
> +int __node_distance(int from, int to)
> +{
> +       if (from >= numa_distance_cnt || to >= numa_distance_cnt)
> +               return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
> +       return numa_distance[from * numa_distance_cnt + to];
> +}
> +EXPORT_SYMBOL(__node_distance);
> +
> +static int __init numa_register_memblks(struct numa_meminfo *mi)
> +{
> +       unsigned long uninitialized_var(pfn_align);
> +       int i, nid;
> +
> +       /* Account for nodes with cpus and no memory */
> +       node_possible_map = numa_nodes_parsed;
> +       numa_nodemask_from_meminfo(&node_possible_map, mi);
> +       if (WARN_ON(nodes_empty(node_possible_map)))
> +               return -EINVAL;
> +
> +       for (i = 0; i < mi->nr_blks; i++) {
> +               struct numa_memblk *mb = &mi->blk[i];
> +
> +               memblock_set_node(mb->start, mb->end - mb->start,
> +                                 &memblock.memory, mb->nid);
> +       }
> +
> +       /*
> +        * If sections array is gonna be used for pfn -> nid mapping, check
> +        * whether its granularity is fine enough.
> +        */
> +#ifdef NODE_NOT_IN_PAGE_FLAGS
> +       pfn_align = node_map_pfn_alignment();
> +       if (pfn_align && pfn_align < PAGES_PER_SECTION) {
> +               pr_warn("Node alignment %lluMB < min %lluMB, rejecting NUMA config\n",
> +                      PFN_PHYS(pfn_align) >> 20,
> +                      PFN_PHYS(PAGES_PER_SECTION) >> 20);
> +               return -EINVAL;
> +       }
> +#endif
> +       if (!numa_meminfo_cover_memory(mi))
> +               return -EINVAL;
> +
> +       /* Finally register nodes. */
> +       for_each_node_mask(nid, node_possible_map) {
> +               u64 start = PFN_PHYS(max_pfn);
> +               u64 end = 0;
> +
> +               for (i = 0; i < mi->nr_blks; i++) {
> +                       if (nid != mi->blk[i].nid)
> +                               continue;
> +                       start = min(mi->blk[i].start, start);
> +                       end = max(mi->blk[i].end, end);
> +               }
> +
> +               if (start < end)
> +                       setup_node_data(nid, start, end);
> +       }
> +
> +       /* Dump memblock with node info and return. */
> +       memblock_dump_all();
> +       return 0;
> +}
> +
> +static int __init numa_init(int (*init_func)(void))
> +{
> +       int ret, i;
> +
> +       nodes_clear(numa_nodes_parsed);
> +       nodes_clear(node_possible_map);
> +       nodes_clear(node_online_map);
> +
> +       ret = init_func();
> +       if (ret < 0)
> +               return ret;
> +
> +       ret = numa_register_memblks(&numa_meminfo);
> +       if (ret < 0)
> +               return ret;
> +
> +       for (i = 0; i < nr_cpu_ids; i++)
> +               numa_clear_node(i);
> +
> +       setup_node_to_cpumask_map();
> +       build_cpu_to_node_map();
> +       return 0;
> +}
> +
> +/**
> + * dummy_numa_init - Fallback dummy NUMA init
> + *
> + * Used if there's no underlying NUMA architecture, NUMA initialization
> + * fails, or NUMA is disabled on the command line.
> + *
> + * Must online at least one node and add memory blocks that cover all
> + * allowed memory.  This function must not fail.
> + */
> +static int __init dummy_numa_init(void)
> +{
> +       pr_info("%s\n", "No NUMA configuration found");
> +       pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n",
> +              0LLU, PFN_PHYS(max_pfn) - 1);
> +       node_set(0, numa_nodes_parsed);
> +       numa_add_memblk(0, 0, PFN_PHYS(max_pfn));
> +       dummy_numa_enabled = 1;
> +
> +       return 0;
> +}
> +
> +/**
> + * arm64_numa_init - Initialize NUMA
> + *
> + * Try each configured NUMA initialization method until one succeeds.  The
> + * last fallback is dummy single node config encompassing whole memory and
> + * never fails.
> + */
> +void __init arm64_numa_init(void)
> +{
> +       numa_init(dummy_numa_init);
> +}
> --
> 1.8.1.4
>
Can you please review this patch?

Thanks,
Ganapat
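
For reference, the distance-table handling in the quoted patch boils down to a flat, row-major array indexed as from * cnt + to, pre-filled with default distances. Below is a minimal user-space sketch of that idea, not code from this series. struct distance_table and the table_* helpers are hypothetical names used only for illustration, malloc() stands in for the memblock allocation, and LOCAL_DISTANCE / REMOTE_DISTANCE use the kernel's default values of 10 and 20.

```c
#include <assert.h>
#include <stdlib.h>

#define LOCAL_DISTANCE  10   /* kernel default: distance of a node to itself */
#define REMOTE_DISTANCE 20   /* kernel default: distance to any other node */

/* Hypothetical user-space stand-in for numa_distance / numa_distance_cnt. */
struct distance_table {
	int cnt;             /* nodes per side of the square table */
	unsigned char *dist; /* cnt * cnt entries: dist[from * cnt + to] */
};

/* Allocate a cnt x cnt table and fill it with the default distances. */
static int table_init(struct distance_table *t, int cnt)
{
	int i, j;

	assert(t != NULL);
	if (cnt <= 0)
		return -1;
	t->dist = malloc((size_t)cnt * cnt);
	if (!t->dist)
		return -1;
	t->cnt = cnt;
	for (i = 0; i < cnt; i++)
		for (j = 0; j < cnt; j++)
			t->dist[i * cnt + j] =
				(i == j) ? LOCAL_DISTANCE : REMOTE_DISTANCE;
	return 0;
}

/* Store one entry; reject out-of-range nodes and nonsensical distances. */
static int table_set(struct distance_table *t, int from, int to, int distance)
{
	if (from < 0 || to < 0 || from >= t->cnt || to >= t->cnt)
		return -1;
	if (distance < 0 || distance > 255 ||
	    (from == to && distance != LOCAL_DISTANCE))
		return -1;
	t->dist[from * t->cnt + to] = (unsigned char)distance;
	return 0;
}

/* Look up one entry; unknown nodes fall back to the default distances. */
static int table_get(const struct distance_table *t, int from, int to)
{
	if (from < 0 || to < 0 || from >= t->cnt || to >= t->cnt)
		return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
	return t->dist[from * t->cnt + to];
}
```

Note that the table is not forced to be symmetric: setting the distance from node 0 to node 1 leaves the reverse entry at its default unless it is set separately, which matches how numa_set_distance() in the patch writes a single entry per call.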

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 1/4] arm64, numa: adding numa support for arm64 platforms.
  2015-09-03  9:52       ` Ganapatrao Kulkarni
@ 2015-09-03 10:13           ` Will Deacon
  -1 siblings, 0 replies; 94+ messages in thread
From: Will Deacon @ 2015-09-03 10:13 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: Ganapatrao Kulkarni, Catalin Marinas,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A, Leif Lindholm,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, Ard Biesheuvel,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring, Steve Capper,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A, Al Stone, Arnd Bergmann,
	Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Prasun.Kapoor-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8, Robert

On Thu, Sep 03, 2015 at 10:52:02AM +0100, Ganapatrao Kulkarni wrote:
> On Fri, Aug 14, 2015 at 10:09 PM, Ganapatrao Kulkarni
> <gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org> wrote:
> > Adding numa support for arm64 based platforms.
> > By default, this patch adds a dummy NUMA node and
> > maps all memory and CPUs to node 0.
> > With this patch, NUMA can be simulated on single-node arm64 platforms.
> >
> > Signed-off-by: Ganapatrao Kulkarni <gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>

[trimmed ~850 lines of context]

> Can you please review this patch?

It's the middle of the merge window, please be patient.

Will
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-29 14:56                 ` Ganapatrao Kulkarni
@ 2015-09-08 13:27                     ` Hanjun Guo
  -1 siblings, 0 replies; 94+ messages in thread
From: Hanjun Guo @ 2015-09-08 13:27 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, Leizhen (ThunderTown)
  Cc: Rob Herring, Mark Rutland, Benjamin Herrenschmidt,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4,
	al.stone-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A, Catalin Marinas,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg, Will Deacon,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	Pawel Moll, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Ganapatrao Kulkarni

Hi Ganapatrao,

On 08/29/2015 10:56 PM, Ganapatrao Kulkarni wrote:
> Hi Thunder,
>
> On Sat, Aug 29, 2015 at 3:16 PM, Leizhen (ThunderTown)
> <thunder.leizhen-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> wrote:
>>
>>
>> On 2015/8/28 22:02, Rob Herring wrote:
>>> +benh
>>>
>>> On Fri, Aug 28, 2015 at 7:32 AM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
>>>> Hi,
>>>>
>>>> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
>>>>> DT bindings for numa map for memory, cores and IOs using
>>>>> arm,associativity device node property.
>>>>
>>>> Given this is just a copy of ibm,associativity, I'm not sure I see much
>>>> point in renaming the properties.
>>>
>>> So just keep the ibm? I'm okay with that. That would help move to
>>> common code. Alternatively, we could drop the vendor prefix and have
>>> common code just check for both.
>>>
>>
>> Hi all,
>>
>> Why not copy the ACPI NUMA approach? Only three things need to be configured:
>> 1) which node each CPU belongs to
>> 2) which node each memory block belongs to
>> 3) the distance between each pair of nodes
>>
>> The NUMA devicetree nodes could look like this:
>> / {
>>          ...
>>
>>          numa-nodes-info {
>>                  node-name: node-description {
>>                          mem-ranges = <...>;
>>                          cpus-list = <...>;
>>                  };
>>
>>                  nodes-distance {
>>                          distance-list = <...>;
>>                  };
>>          };
>>
>>          ...
>> };
>>
> Something similar to what you are proposing was already implemented in
> my v2 patchset:
> https://lwn.net/Articles/623920/
> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/305164.html
> We moved to an associativity-property-based implementation to keep it
> more generic.
> I do have both ACPI (using Linaro/Hanjun's patches) and associativity
> based implementations in our internal tree, and both have been tested
> on the ThunderX platform.

Great thanks!

> I do see an issue with creating NUMA mappings for I/O devices using ACPI
> (for example, I am not able to create a NUMA mapping for the ITS, which
> is present on each node, using ACPI tables), since the ACPI spec tables
> (SRAT and SLIT) cover only processors and memory.

I'm not sure why the ITS needs to know its NUMA domain. As I understand
it, an interrupt is routed to the correct NUMA domain by setting its
affinity, and the ITS is configured to route it to the right GICR (CPU).
So I don't think the ITS needs to know which NUMA node it belongs to.
Correct me if I missed something.

Thanks
Hanjun

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-09-08 13:27                     ` Hanjun Guo
@ 2015-09-08 16:27                         ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-09-08 16:27 UTC (permalink / raw)
  To: Hanjun Guo
  Cc: Leizhen (ThunderTown),
	Rob Herring, Mark Rutland, Benjamin Herrenschmidt,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4,
	al.stone-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A, Catalin Marinas,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg, Will Deacon,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	Pawel Moll, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	galak-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org

Hi Hanjun,

On Tue, Sep 8, 2015 at 6:57 PM, Hanjun Guo <hanjun.guo-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:
> Hi Ganapatrao,
>
>
> On 08/29/2015 10:56 PM, Ganapatrao Kulkarni wrote:
>>
>> Hi Thunder,
>>
>> On Sat, Aug 29, 2015 at 3:16 PM, Leizhen (ThunderTown)
>> <thunder.leizhen-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> wrote:
>>>
>>>
>>>
>>> On 2015/8/28 22:02, Rob Herring wrote:
>>>>
>>>> +benh
>>>>
>>>> On Fri, Aug 28, 2015 at 7:32 AM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org>
>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
>>>>>>
>>>>>> DT bindings for numa map for memory, cores and IOs using
>>>>>> arm,associativity device node property.
>>>>>
>>>>>
>>>>> Given this is just a copy of ibm,associativity, I'm not sure I see much
>>>>> point in renaming the properties.
>>>>
>>>>
>>>> So just keep the ibm? I'm okay with that. That would help move to
>>>> common code. Alternatively, we could drop the vendor prefix and have
>>>> common code just check for both.
>>>>
>>>
>>> Hi all,
>>>
>>> Why not copy the ACPI NUMA approach? Only three things need to be
>>> configured:
>>> 1) which node each CPU belongs to
>>> 2) which node each memory block belongs to
>>> 3) the distance between each pair of nodes
>>>
>>> The NUMA devicetree nodes could look like this:
>>> / {
>>>          ...
>>>
>>>          numa-nodes-info {
>>>                  node-name: node-description {
>>>                          mem-ranges = <...>;
>>>                          cpus-list = <...>;
>>>                  };
>>>
>>>                  nodes-distance {
>>>                          distance-list = <...>;
>>>                  };
>>>          };
>>>
>>>          ...
>>> };
>>>
>> some what similar to what your are proposing is already implemented in
>> my v2 patchset.
>> https://lwn.net/Articles/623920/
>>
>> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/305164.html
>> we have went to associativity property based implementation to keep it
>> more generic.
>> i do have both acpi(using linaro/hanjun's patches) and associativity
>> based implementations on our internal tree
>> and tested on thunderx platform.
>
>
> Great thanks!
>
>> i do see issue in creating numa mapping using ACPI for IOs(for
>> example, i am not able to create numa mapping for ITS which is on each
>> node, using ACPI tables),  since ACPI spec (tables SRAT and SLIT)
>> talks only about processor and memory.
>
>
> I'm not sure why the ITS needs to know the NUMA domain, for my
> understanding, the interrupt will route to the correct NUMA domain
> using setting the affinity, ITS will configured to route it to
> the right GICR(cpu), so I think the ITS don't need to know which
> NUMA node belonging to, correct me if I missed something.
IIUC, a GICR/collection is per CPU and can be mapped to a NUMA node using
the CPU-to-node mapping.
However, there are multiple ITSs on a multi-socket platform (at least one
ITS per socket). Knowing the ITS-to-NUMA-node mapping helps route
interrupts optimally to any one of the GICRs/collections of that node.
Hence, we need to find out from DT which ITS belongs to which socket/node.
The same applies to the PCI bus.
>
> Thanks
> Hanjun

thanks
Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-09-08 16:27                         ` Ganapatrao Kulkarni
@ 2015-09-11  3:53                             ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-09-11  3:53 UTC (permalink / raw)
  To: Hanjun Guo
  Cc: Leizhen (ThunderTown),
	Rob Herring, Mark Rutland, Benjamin Herrenschmidt,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4,
	al.stone-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A, Catalin Marinas,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg, Will Deacon,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	Pawel Moll, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	galak-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org

Hi Thunder,


On Tue, Sep 8, 2015 at 9:57 PM, Ganapatrao Kulkarni
<gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Hi Hanjun,
>
> On Tue, Sep 8, 2015 at 6:57 PM, Hanjun Guo <hanjun.guo-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:
>> Hi Ganapatrao,
>>
>>
>> On 08/29/2015 10:56 PM, Ganapatrao Kulkarni wrote:
>>>
>>> Hi Thunder,
>>>
>>> On Sat, Aug 29, 2015 at 3:16 PM, Leizhen (ThunderTown)
>>> <thunder.leizhen-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> wrote:
>>>>
>>>>
>>>>
>>>> On 2015/8/28 22:02, Rob Herring wrote:
>>>>>
>>>>> +benh
>>>>>
>>>>> On Fri, Aug 28, 2015 at 7:32 AM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org>
>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
>>>>>>>
>>>>>>> DT bindings for numa map for memory, cores and IOs using
>>>>>>> arm,associativity device node property.
>>>>>>
>>>>>>
>>>>>> Given this is just a copy of ibm,associativity, I'm not sure I see much
>>>>>> point in renaming the properties.
>>>>>
>>>>>
>>>>> So just keep the ibm? I'm okay with that. That would help move to
>>>>> common code. Alternatively, we could drop the vendor prefix and have
>>>>> common code just check for both.
>>>>>
>>>>
>>>> Hi all,
>>>>
>>>> Why not copy the method of ACPI numa? There only three elements should be
>>>> configured:
>>>> 1) a cpu belong to which node
>>>> 2) a memory block belong to which node
>>>> 3) the distance of each two nodes
I too thought ACPI only defines the mapping of CPUs and memory to NUMA
nodes, with nothing specified for IOs.
However, after going through the x86 implementation, I can see there is
a provision in ACPI for mapping IOs to NUMA nodes.
In the x86 code, pci_acpi_scan_root() calls acpi_get_node() to get the
node associated with a PCI bus via the _PXM object.
This implies that there is an entry in the ACPI tables that maps a PCI bus
to a NUMA node (proximity domain).
So in DT too, we should have a binding that defines the CPU, memory and IO
to node mapping.
>>>>
>>>> The devicetree nodes of numa can be like below:
>>>> / {
>>>>          ...
>>>>
>>>>          numa-nodes-info {
>>>>                  node-name: node-description {
>>>>                          mem-ranges = <...>;
>>>>                          cpus-list = <...>;
>>>>                  };
>>>>
>>>>                  nodes-distance {
>>>>                          distance-list = <...>;
>>>>                  };
>>>>          };
>>>>
>>>>          ...
>>>> };
>>>>
>>> some what similar to what your are proposing is already implemented in
>>> my v2 patchset.
>>> https://lwn.net/Articles/623920/
>>>
>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/305164.html
>>> we have went to associativity property based implementation to keep it
>>> more generic.
>>> i do have both acpi(using linaro/hanjun's patches) and associativity
>>> based implementations on our internal tree
>>> and tested on thunderx platform.
>>
>>
>> Great thanks!
>>
>>> i do see issue in creating numa mapping using ACPI for IOs(for
>>> example, i am not able to create numa mapping for ITS which is on each
>>> node, using ACPI tables),  since ACPI spec (tables SRAT and SLIT)
>>> talks only about processor and memory.
>>
>>
>> I'm not sure why the ITS needs to know the NUMA domain, for my
>> understanding, the interrupt will route to the correct NUMA domain
>> using setting the affinity, ITS will configured to route it to
>> the right GICR(cpu), so I think the ITS don't need to know which
>> NUMA node belonging to, correct me if I missed something.
> IIUC, GICR/collection is per cpu and can be mapped to numa node using
> cpu to node mapping.
> However there are multiple its in multi-socket platform(at-least one
> its per socket),
> knowing its to numa node mapping will help in routing(optimal) the
> interrupts to  any one of GICR/collections of that node
> Hence, we need to find which its belongs to which socket/node using dt.
> same applies to pci bus too.
>>
>> Thanks
>> Hanjun
>
> thanks
> Ganapat
thanks
Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-09-11  3:53                             ` Ganapatrao Kulkarni
@ 2015-09-11  6:43                                 ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 94+ messages in thread
From: Leizhen (ThunderTown) @ 2015-09-11  6:43 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, Hanjun Guo
  Cc: Rob Herring, Mark Rutland, Benjamin Herrenschmidt,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4,
	al.stone-QSEj5FYQhm4dnm+yROfE0A,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A, Catalin Marinas,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg, Will Deacon,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	Pawel Moll, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, Ganapatrao Kulkarni



On 2015/9/11 11:53, Ganapatrao Kulkarni wrote:
> Hi Thunder,
> 
> 
> On Tue, Sep 8, 2015 at 9:57 PM, Ganapatrao Kulkarni
> <gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> Hi Hanjun,
>>
>> On Tue, Sep 8, 2015 at 6:57 PM, Hanjun Guo <hanjun.guo-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:
>>> Hi Ganapatrao,
>>>
>>>
>>> On 08/29/2015 10:56 PM, Ganapatrao Kulkarni wrote:
>>>>
>>>> Hi Thunder,
>>>>
>>>> On Sat, Aug 29, 2015 at 3:16 PM, Leizhen (ThunderTown)
>>>> <thunder.leizhen-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2015/8/28 22:02, Rob Herring wrote:
>>>>>>
>>>>>> +benh
>>>>>>
>>>>>> On Fri, Aug 28, 2015 at 7:32 AM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
>>>>>>>>
>>>>>>>> DT bindings for numa map for memory, cores and IOs using
>>>>>>>> arm,associativity device node property.
>>>>>>>
>>>>>>>
>>>>>>> Given this is just a copy of ibm,associativity, I'm not sure I see much
>>>>>>> point in renaming the properties.
>>>>>>
>>>>>>
>>>>>> So just keep the ibm? I'm okay with that. That would help move to
>>>>>> common code. Alternatively, we could drop the vendor prefix and have
>>>>>> common code just check for both.
>>>>>>
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Why not copy the method of ACPI numa? There only three elements should be
>>>>> configured:
>>>>> 1) a cpu belong to which node
>>>>> 2) a memory block belong to which node
>>>>> 3) the distance of each two nodes
> I too thought acpi only defines mapping for cpu and memory to numa
> nodes and no specification to define for IOs.
> however after going through the x86 implementation, i can see there is
> provision for mapping IOs to numa node in acpi.
> in x86 code, function pci_acpi_scan_root calls acpi_get_node to get
> associated node for pci bus using _PXM object.
> it imply there is entry in acpi tables to map pci bus for numa
> node(proximity domain).
> so in dt also, we should  have binding to define cpu, memory and IOs
> to node mapping.

Yes, we should implement of_node_to_nid so that device drivers can use dev_to_node.

I added a description of it in my reply to Benjamin Herrenschmidt, at 2015/8/31 9:46.


>>>>>
>>>>> The devicetree nodes of numa can be like below:
>>>>> / {
>>>>>          ...
>>>>>
>>>>>          numa-nodes-info {
>>>>>                  node-name: node-description {
>>>>>                          mem-ranges = <...>;
>>>>>                          cpus-list = <...>;
>>>>>                  };
>>>>>
>>>>>                  nodes-distance {
>>>>>                          distance-list = <...>;
>>>>>                  };
>>>>>          };
>>>>>
>>>>>          ...
>>>>> };
>>>>>
>>>> some what similar to what your are proposing is already implemented in
>>>> my v2 patchset.
>>>> https://lwn.net/Articles/623920/
>>>>
>>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/305164.html
>>>> we have went to associativity property based implementation to keep it
>>>> more generic.
>>>> i do have both acpi(using linaro/hanjun's patches) and associativity
>>>> based implementations on our internal tree
>>>> and tested on thunderx platform.
>>>
>>>
>>> Great thanks!
>>>
>>>> i do see issue in creating numa mapping using ACPI for IOs(for
>>>> example, i am not able to create numa mapping for ITS which is on each
>>>> node, using ACPI tables),  since ACPI spec (tables SRAT and SLIT)
>>>> talks only about processor and memory.
>>>
>>>
>>> I'm not sure why the ITS needs to know the NUMA domain, for my
>>> understanding, the interrupt will route to the correct NUMA domain
>>> using setting the affinity, ITS will configured to route it to
>>> the right GICR(cpu), so I think the ITS don't need to know which
>>> NUMA node belonging to, correct me if I missed something.
>> IIUC, GICR/collection is per cpu and can be mapped to numa node using
>> cpu to node mapping.
>> However there are multiple its in multi-socket platform(at-least one
>> its per socket),
>> knowing its to numa node mapping will help in routing(optimal) the
>> interrupts to  any one of GICR/collections of that node
>> Hence, we need to find which its belongs to which socket/node using dt.
>> same applies to pci bus too.
>>>
>>> Thanks
>>> Hanjun
>>
>> thanks
>> Ganapat
> thanks
> Ganapat
> 
> .
> 


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-08-28 12:32       ` Mark Rutland
  (?)
  (?)
@ 2015-09-29  8:35       ` Ganapatrao Kulkarni
       [not found]         ` <CAFpQJXWzM644KsFWP9ei-k6gWgNVpBVT+UbY7NYdyfmyL=zMkw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  -1 siblings, 1 reply; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-09-29  8:35 UTC (permalink / raw)
  To: Mark Rutland, Benjamin Herrenschmidt, Prasun Kapoor, Rob Herring,
	Leizhen (ThunderTown)
  Cc: Ganapatrao Kulkarni, linux-arm-kernel, devicetree, Will Deacon,
	Catalin Marinas, grant.likely, leif.lindholm, rfranz,
	ard.biesheuvel, msalter, robh+dt, steve.capper, hanjun.guo,
	al.stone, arnd, Pawel Moll, ijc+devicetree, galak, Marc Zyngier

Hi Mark,

I have tried to answer your comments below; in the meantime, we are waiting
for Ben to share the details.

On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <mark.rutland@arm.com> wrote:

> Hi,
>
> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
> > DT bindings for numa map for memory, cores and IOs using
> > arm,associativity device node property.
>
> Given this is just a copy of ibm,associativity, I'm not sure I see much
> point in renaming the properties.
>
> However, (somewhat counter to that) I'm also concerned that this isn't
> sufficient for systems we're beginning to see today (more on that
> below), so I don't think a simple copy of ibm,associativity is good
> enough.
>
It is just a copy right now; however, it can evolve as we come across more
arm64 NUMA platforms.

>
> >
> > Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
> > ---
> >  Documentation/devicetree/bindings/arm/numa.txt | 212 +++++++++++++++++++++++++
> >  1 file changed, 212 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
> >
> > diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
> > new file mode 100644
> > index 0000000..dc3ef86
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/arm/numa.txt
> > @@ -0,0 +1,212 @@
> > +==============================================================================
> > +NUMA binding description.
> > +==============================================================================
> > +
> > +==============================================================================
> > +1 - Introduction
> > +==============================================================================
> > +
> > +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
> > +collections of hardware resources including processors, memory, and I/O buses,
> > +that comprise what is commonly known as a NUMA node.
> > +Processor accesses to memory within the local NUMA node is generally faster
> > +than processor accesses to memory outside of the local NUMA node.
> > +DT defines interfaces that allow the platform to convey NUMA node
> > +topology information to OS.
> > +
> > +==============================================================================
> > +2 - arm,associativity
> > +==============================================================================
> > +The mapping is done using arm,associativity device property.
> > +this property needs to be present in every device node which needs to to be
> > +mapped to numa nodes.
>
> Can't there be some inheritance? e.g. all devices on a bus with an
> arm,associativity property being assumed to share that value?
>
Yes, there is inheritance, and the respective bus drivers should take care
of it, as the PCI driver does at present.
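
A minimal sketch of that inheritance, assuming a toy node type rather than
the kernel's struct device_node. The names toy_node, toy_node_to_nid and
TOY_NO_NODE are hypothetical and are not taken from this patchset; the walk
simply climbs towards the root until it finds a node that carries an
associativity list:

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_NO_NODE (-1)

/* Toy stand-in for a device tree node (assumption, not a kernel type). */
struct toy_node {
	const struct toy_node *parent;	/* NULL at the root */
	const uint32_t *assoc;		/* NULL if the property is absent */
	size_t assoc_len;		/* number of cells in assoc */
};

/*
 * Return the NUMA node id for np, inheriting from the nearest ancestor
 * that has a usable associativity list. nid_index selects which cell of
 * the list holds the node id.
 */
int toy_node_to_nid(const struct toy_node *np, uint32_t nid_index)
{
	for (; np; np = np->parent) {
		/* Nodes without a long enough list are skipped. */
		if (np->assoc && nid_index < np->assoc_len)
			return (int)np->assoc[nid_index];
	}
	return TOY_NO_NODE;
}
```

With this shape, a device below a bus needs no property of its own: it
resolves to whatever node its bus is mapped to.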

>
> > +
> > +arm,associativity property is set of 32-bit integers which defines level of
>
> s/set/list/ -- the order is important.
>
ok

>
> > +topology and boundary in the system at which a significant difference in
> > +performance can be measured between cross-device accesses within
> > +a single location and those spanning multiple locations.
> > +The first cell always contains the broadest subdivision within the system,
> > +while the last cell enumerates the individual devices, such as an SMT thread
> > +of a CPU, or a bus bridge within an SoC".
>
> While this gives us some hierarchy, this doesn't seem to encode relative
> distances at all. That seems like an oversight.
>

The distance is computed; I will add the details to the document.
A local node has a distance of 10 (LOCAL_DISTANCE), and the distance
doubles at every level.
For example, in a one-level NUMA topology, the distance from the local node
to a remote node is 20.
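
A rough sketch of that rule, under stated assumptions: the function name
assoc_distance and its parameters are hypothetical, and the choice to stop
at the first reference level where both lists agree is my reading of the
description above, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOCAL_DISTANCE 10

/*
 * a and b are the associativity lists of the two devices. ref_points[]
 * holds indexes into those lists, most significant boundary first.
 * Start at LOCAL_DISTANCE and double once per reference level at which
 * the lists differ; stop at the first level where they agree.
 */
int assoc_distance(const uint32_t *a, const uint32_t *b,
		   const uint32_t *ref_points, size_t nr_ref_points)
{
	int distance = LOCAL_DISTANCE;
	size_t i;

	assert(a && b && ref_points);

	for (i = 0; i < nr_ref_points; i++) {
		if (a[ref_points[i]] == b[ref_points[i]])
			break;
		distance *= 2;
	}
	return distance;
}
```

For the one-level case, two devices on the same board get 10 and two
devices on different boards get 20.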


>
> Additionally, I'm somewhat unclear on how what you'd be expected to
> provide for this property in cases like ring or mesh interconnects,
> where there isn't a strict hierarchy (see systems with ARM's own CCN, or
> Tilera's TILE-Mx), but there is some measure of closeness.
>

IIUC, in ARM's CCN architecture all cores/clusters are at an equal distance
from DDR, so I don't see any NUMA topology there.
However, if two SoCs are connected through the CCN, then it is very
similar to the Cavium topology.

> Must all of these have the same length? If so, why not have a
> #(whatever)-cells property in the root to describe the expected length?
> If not, how are they to be interpreted relative to each other?
>

Yes, all have the same default size.
IMHO, there is no need to add a cells property.

>
> > +
> > +ex:
>
> s/ex/Example:/, please. There's no need to contract that.
>
> > +       /* board 0, socket 0, cluster 0, core 0  thread 0 */
> > +       arm,associativity = <0 0 0 0 0>;
> > +
> >
> +==============================================================================
> > +3 - arm,associativity-reference-points
> >
> +==============================================================================
> > +This property is a set of 32-bit integers, each representing an index
> into
>
> Likeise, s/set/list/
>
ok

>
> > +the arm,associativity nodes. The first integer is the most significant
> > +NUMA boundary and the following are progressively less significant boundaries.
> > +There can be more than one level of NUMA.
>
> I'm not clear on why this is necessary; the arm,associativity property
> is already ordered from most significant to least significant per its
> description.
>

The first entry in arm,associativity-reference-points identifies which
entry in arm,associativity defines the node ID.
The entries in arm,associativity-reference-points also define how many
entries (the depth) in arm,associativity are used to calculate the node
distance, in both one-level and multi-level (hierarchical) NUMA topologies.
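
To illustrate the first point, a small sketch assuming hypothetical names
(assoc_to_nid is not a function from the patch): the first reference point
is used as an index into the associativity list, and the cell found there
is taken as the node ID.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * assoc[] is a device's associativity list, ref_points[] is the list
 * from the root node. Returns the node id, or -1 when there are no
 * reference points or the first one indexes past the end of assoc[].
 */
int assoc_to_nid(const uint32_t *assoc, size_t assoc_len,
		 const uint32_t *ref_points, size_t nr_ref_points)
{
	if (!assoc || !ref_points || nr_ref_points == 0)
		return -1;
	if (ref_points[0] >= assoc_len)
		return -1;
	return (int)assoc[ref_points[0]];
}
```

So with an associativity list of <1 0 0xffff>, reference points of <0>
select the board id (node 1), while reference points of <1> select the
socket id (node 0).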


>
> What does this property achieve?
>
> The description also doesn't describe where this property is expected to
> live. The example isn't sufficient to disambiguate that, especially as
> it seems like a trivial case.
>
Sure, I will add one more example to describe
arm,associativity-reference-points.

>
> Is this only expected at the root of the tree? Can it be re-defined in
> sub-nodes?
>
Yes, it is defined only at the root.

>
> > +
> > +Ex:
>
> s/Ex/Example:/, please
>
sure.

>
> > +       arm,associativity-reference-points = <0 1>;
> > +       The board Id(index 0) used first to calculate the associativity (node
> > +       distance), then follows the  socket id(index 1).
> > +
> > +       arm,associativity-reference-points = <1 0>;
> > +       The socket Id(index 1) used first to calculate the associativity,
> > +       then follows the board id(index 0).
> > +
> > +       arm,associativity-reference-points = <0>;
> > +       Only the board Id(index 0) used to calculate the associativity.
> > +
> > +       arm,associativity-reference-points = <1>;
> > +       Only socket Id(index 1) used to calculate the associativity.
> > +
> > +==============================================================================
> > +4 - Example dts
> > +==============================================================================
> > +
> > +Example: 2 Node system consists of 2 boards and each board having one socket
> > +and 8 core in each socket.
> > +
> > +       arm,associativity-reference-points = <0>;
> > +
> > +       memory@00c00000 {
> > +               device_type = "memory";
> > +               reg = <0x0 0x00c00000 0x0 0x80000000>;
> > +               /* board 0, socket 0, no specific core */
> > +               arm,associativity = <0 0 0xffff>;
> > +       };
> > +
> > +       memory@10000000000 {
> > +               device_type = "memory";
> > +               reg = <0x100 0x00000000 0x0 0x80000000>;
> > +               /* board 1, socket 0, no specific core */
> > +               arm,associativity = <1 0 0xffff>;
> > +       };
> > +
> > +       cpus {
> > +               #address-cells = <2>;
> > +               #size-cells = <0>;
> > +
> > +               cpu@000 {
> > +                       device_type = "cpu";
> > +                       compatible =  "arm,armv8";
> > +                       reg = <0x0 0x000>;
> > +                       enable-method = "psci";
> > +                       /* board 0, socket 0, core 0*/
> > +                       arm,associativity = <0 0 0>;
>
> We should specify w.r.t. memory and CPUs how the property is expected to
> be used (e.g. in the CPU nodes rather than the cpu-map, with separate
> memory nodes, etc). The generic description of arm,associativity isn't
> sufficient to limit confusion there.
>
OK, I will add details such as which nodes can use this property.


>
> Thanks,
> Mark.
>

thanks
Ganapat


* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-09-29  8:35       ` Ganapatrao Kulkarni
@ 2015-09-29  8:38             ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-09-29  8:38 UTC (permalink / raw)
  To: Mark Rutland, Benjamin Herrenschmidt, Prasun Kapoor, Rob Herring,
	Leizhen (ThunderTown)
  Cc: Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4, Pawel Moll,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
	galak-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org

(Sending again; the previous mail was set to HTML mode by mistake.)

On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
<gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Hi Mark,
>
> I have tried to answer your comments, in the meantime we are waiting for Ben
> to share the details.
>
> On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
>>
>> Hi,
>>
>> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
>> > DT bindings for numa map for memory, cores and IOs using
>> > arm,associativity device node property.
>>
>> Given this is just a copy of ibm,associativity, I'm not sure I see much
>> point in renaming the properties.
>>
>> However, (somewhat counter to that) I'm also concerned that this isn't
>> sufficient for systems we're beginning to see today (more on that
>> below), so I don't think a simple copy of ibm,associativity is good
>> enough.
>
> It is just a copy right now; however, it can evolve as we come across more
> arm64 NUMA platforms.
>>
>>
>> >
>> > Signed-off-by: Ganapatrao Kulkarni <gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
>> > ---
>> >  Documentation/devicetree/bindings/arm/numa.txt | 212
>> > +++++++++++++++++++++++++
>> >  1 file changed, 212 insertions(+)
>> >  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>> >
>> > diff --git a/Documentation/devicetree/bindings/arm/numa.txt
>> > b/Documentation/devicetree/bindings/arm/numa.txt
>> > new file mode 100644
>> > index 0000000..dc3ef86
>> > --- /dev/null
>> > +++ b/Documentation/devicetree/bindings/arm/numa.txt
>> > @@ -0,0 +1,212 @@
>> >
>> > +==============================================================================
>> > +NUMA binding description.
>> >
>> > +==============================================================================
>> > +
>> >
>> > +==============================================================================
>> > +1 - Introduction
>> >
>> > +==============================================================================
>> > +
>> > +Systems employing a Non Uniform Memory Access (NUMA) architecture
>> > contain
>> > +collections of hardware resources including processors, memory, and I/O
>> > buses,
>> > +that comprise what is commonly known as a NUMA node.
>> > +Processor accesses to memory within the local NUMA node is generally
>> > faster
>> > +than processor accesses to memory outside of the local NUMA node.
>> > +DT defines interfaces that allow the platform to convey NUMA node
>> > +topology information to OS.
>> > +
>> >
>> > +==============================================================================
>> > +2 - arm,associativity
>> >
>> > +==============================================================================
>> > +The mapping is done using arm,associativity device property.
>> > +this property needs to be present in every device node which needs to
>> > to be
>> > +mapped to numa nodes.
>>
>> Can't there be some inheritance? e.g. all devices on a bus with an
>> arm,associativity property being assumed to share that value?
>
> Yes, there is inheritance, and the respective bus drivers should take care of
> it, as the PCI driver does at present.
>>
>>
>> > +
>> > +arm,associativity property is set of 32-bit integers which defines
>> > level of
>>
>> s/set/list/ -- the order is important.
>
> ok
>>
>>
>> > +topology and boundary in the system at which a significant difference
>> > in
>> > +performance can be measured between cross-device accesses within
>> > +a single location and those spanning multiple locations.
>> > +The first cell always contains the broadest subdivision within the
>> > system,
>> > +while the last cell enumerates the individual devices, such as an SMT
>> > thread
>> > +of a CPU, or a bus bridge within an SoC".
>>
>> While this gives us some hierarchy, this doesn't seem to encode relative
>> distances at all. That seems like an oversight.
>
>
> The distance is computed; I will add the details to the document.
> A local node has a distance of 10 (LOCAL_DISTANCE), and the distance doubles
> with every level.
> For example, in a level 1 NUMA topology, the distance from the local node to
> a remote node is 20.
>
>>
>>
>> Additionally, I'm somewhat unclear on how what you'd be expected to
>> provide for this property in cases like ring or mesh interconnects,
>> where there isn't a strict hierarchy (see systems with ARM's own CCN, or
>> Tilera's TILE-Mx), but there is some measure of closeness.
>
>
> IIUC, as per ARM's CCN architecture, all cores/clusters are at an equal
> distance from DDR, so I don't see any NUMA topology.
> However, if two SoCs are connected through the CCN, then it is very
> similar to the Cavium topology.
>
>> Must all of these have the same length? If so, why not have a
>> #(whatever)-cells property in the root to describe the expected length?
>> If not, how are they to be interpreted relative to each other?
>
>
> Yes, all of them are of the default size.
> IMHO, there is no need to add a cells property.
>>
>>
>> > +
>> > +ex:
>>
>> s/ex/Example:/, please. There's no need to contract that.
>>
>> > +       /* board 0, socket 0, cluster 0, core 0  thread 0 */
>> > +       arm,associativity = <0 0 0 0 0>;
>> > +
>> >
>> > +==============================================================================
>> > +3 - arm,associativity-reference-points
>> >
>> > +==============================================================================
>> > +This property is a set of 32-bit integers, each representing an index
>> > into
>>
>> Likewise, s/set/list/
>
> ok
>>
>>
>> > +the arm,associativity nodes. The first integer is the most significant
>> > +NUMA boundary and the following are progressively less significant
>> > boundaries.
>> > +There can be more than one level of NUMA.
>>
>> I'm not clear on why this is necessary; the arm,associativity property
>> is already ordered from most significant to least significant per its
>> description.
>
>
> The first entry in arm,associativity-reference-points selects which
> entry of arm,associativity defines the node id.
> The entries in arm,associativity-reference-points also define how many
> entries (the depth) of arm,associativity are used to calculate the node
> distance, in both single-level and multi-level (hierarchical) NUMA
> topologies.
>
>>
>>
>> What does this property achieve?
>>
>> The description also doesn't describe where this property is expected to
>> live. The example isn't sufficient to disambiguate that, especially as
>> it seems like a trivial case.
>
> Sure, I will add one more example to describe
> arm,associativity-reference-points.
>>
>>
>> Is this only expected at the root of the tree? Can it be re-defined in
>> sub-nodes?
>
> Yes, it is defined only at the root.
>>
>>
>> > +
>> > +Ex:
>>
>> s/Ex/Example:/, please
>
> sure.
>>
>>
>> > +       arm,associativity-reference-points = <0 1>;
>> > +       The board Id(index 0) used first to calculate the associativity
>> > (node
>> > +       distance), then follows the  socket id(index 1).
>> > +
>> > +       arm,associativity-reference-points = <1 0>;
>> > +       The socket Id(index 1) used first to calculate the
>> > associativity,
>> > +       then follows the board id(index 0).
>> > +
>> > +       arm,associativity-reference-points = <0>;
>> > +       Only the board Id(index 0) used to calculate the associativity.
>> > +
>> > +       arm,associativity-reference-points = <1>;
>> > +       Only socket Id(index 1) used to calculate the associativity.
>> > +
>> >
>> > +==============================================================================
>> > +4 - Example dts
>> >
>> > +==============================================================================
>> > +
>> > +Example: 2 Node system consists of 2 boards and each board having one
>> > socket
>> > +and 8 core in each socket.
>> > +
>> > +       arm,associativity-reference-points = <0>;
>> > +
>> > +       memory@00c00000 {
>> > +               device_type = "memory";
>> > +               reg = <0x0 0x00c00000 0x0 0x80000000>;
>> > +               /* board 0, socket 0, no specific core */
>> > +               arm,associativity = <0 0 0xffff>;
>> > +       };
>> > +
>> > +       memory@10000000000 {
>> > +               device_type = "memory";
>> > +               reg = <0x100 0x00000000 0x0 0x80000000>;
>> > +               /* board 1, socket 0, no specific core */
>> > +               arm,associativity = <1 0 0xffff>;
>> > +       };
>> > +
>> > +       cpus {
>> > +               #address-cells = <2>;
>> > +               #size-cells = <0>;
>> > +
>> > +               cpu@000 {
>> > +                       device_type = "cpu";
>> > +                       compatible =  "arm,armv8";
>> > +                       reg = <0x0 0x000>;
>> > +                       enable-method = "psci";
>> > +                       /* board 0, socket 0, core 0*/
>> > +                       arm,associativity = <0 0 0>;
>>
>> We should specify w.r.t. memory and CPUs how the property is expected to
>> be used (e.g. in the CPU nodes rather than the cpu-map, with separate
>> memory nodes, etc). The generic description of arm,associativity isn't
>> sufficient to limit confusion there.
>
> OK, I will add details such as which nodes can use this property.
>
>>
>>
>> Thanks,
>> Mark.
>
>
> thanks
> Ganapat


* Re: [PATCH v5 1/4] arm64, numa: adding numa support for arm64 platforms.
  2015-09-03 10:13           ` Will Deacon
@ 2015-09-29  8:43               ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-09-29  8:43 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ganapatrao Kulkarni, Catalin Marinas,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A, Leif Lindholm,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, Ard Biesheuvel,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring, Steve Capper,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A, Al Stone, Arnd Bergmann,
	Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala,
	Prasun.Kapoor-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8, Robert

Hi Will/Catalin,


On Thu, Sep 3, 2015 at 3:43 PM, Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org> wrote:
> On Thu, Sep 03, 2015 at 10:52:02AM +0100, Ganapatrao Kulkarni wrote:
>> On Fri, Aug 14, 2015 at 10:09 PM, Ganapatrao Kulkarni
>> <gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org> wrote:
>> > Adding numa support for arm64 based platforms.
>> > This patch adds by default the dummy numa node and
>> > maps all memory and cpus to node 0.
>> > using this patch, numa can be simulated on single node arm64 platforms.
>> >
>> > Signed-off-by: Ganapatrao Kulkarni <gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
>
> [trimmed ~850 lines of context]
>
>> can you please review this patch.
>
> It's the middle of the merge window, please be patient.
can you please share your feedback?
>
> Will

thanks
Ganapat


* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-09-29  8:38             ` Ganapatrao Kulkarni
@ 2015-09-29  9:42                 ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 94+ messages in thread
From: Benjamin Herrenschmidt @ 2015-09-29  9:42 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, Mark Rutland, Prasun Kapoor, Rob Herring,
	Leizhen (ThunderTown)
  Cc: Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4, Pawel Moll,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
	galak-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org

On Tue, 2015-09-29 at 14:08 +0530, Ganapatrao Kulkarni wrote:
> (sending again, by mistake it was set to html mode)

I'm sorry, I was trying to get OpenPower to move faster and release PAPR
publicly, but it looks like it's going to take a bit longer, so I'll try
to write a summary in the next couple of days.

Ben.

> On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
> <gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > Hi Mark,
> > 
> > I have tried to answer your comments, in the meantime we are
> > waiting for Ben
> > to share the details.
> > 
> > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org
> > > wrote:
> > > 
> > > Hi,
> > > 
> > > On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni
> > > wrote:
> > > > DT bindings for numa map for memory, cores and IOs using
> > > > arm,associativity device node property.
> > > 
> > > Given this is just a copy of ibm,associativity, I'm not sure I
> > > see much
> > > point in renaming the properties.
> > > 
> > > However, (somewhat counter to that) I'm also concerned that this
> > > isn't
> > > sufficient for systems we're beginning to see today (more on that
> > > below), so I don't think a simple copy of ibm,associativity is
> > > good
> > > enough.
> > 
> > it is just copy right now, however it can evolve when we come
> > across more
> > arm64 numa platforms
> > > 
> > > 
> > > > 
> > > > Signed-off-by: Ganapatrao Kulkarni <
> > > > gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
> > > > ---
> > > >  Documentation/devicetree/bindings/arm/numa.txt | 212
> > > > +++++++++++++++++++++++++
> > > >  1 file changed, 212 insertions(+)
> > > >  create mode 100644
> > > > Documentation/devicetree/bindings/arm/numa.txt
> > > > 
> > > > diff --git a/Documentation/devicetree/bindings/arm/numa.txt
> > > > b/Documentation/devicetree/bindings/arm/numa.txt
> > > > new file mode 100644
> > > > index 0000000..dc3ef86
> > > > --- /dev/null
> > > > +++ b/Documentation/devicetree/bindings/arm/numa.txt
> > > > @@ -0,0 +1,212 @@
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +NUMA binding description.
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +1 - Introduction
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +
> > > > +Systems employing a Non Uniform Memory Access (NUMA)
> > > > architecture
> > > > contain
> > > > +collections of hardware resources including processors,
> > > > memory, and I/O
> > > > buses,
> > > > +that comprise what is commonly known as a NUMA node.
> > > > +Processor accesses to memory within the local NUMA node is
> > > > generally
> > > > faster
> > > > +than processor accesses to memory outside of the local NUMA
> > > > node.
> > > > +DT defines interfaces that allow the platform to convey NUMA
> > > > node
> > > > +topology information to OS.
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +2 - arm,associativity
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +The mapping is done using arm,associativity device property.
> > > > +this property needs to be present in every device node which
> > > > needs to
> > > > to be
> > > > +mapped to numa nodes.
> > > 
> > > Can't there be some inheritance? e.g. all devices on a bus with
> > > an
> > > arm,associativity property being assumed to share that value?
> > 
> > yes there is inheritance and respective bus drivers should take
> > care of it,
> > like pci driver does at present.
> > > 
> > > 
> > > > +
> > > > +arm,associativity property is set of 32-bit integers which
> > > > defines
> > > > level of
> > > 
> > > s/set/list/ -- the order is important.
> > 
> > ok
> > > 
> > > 
> > > > +topology and boundary in the system at which a significant
> > > > difference
> > > > in
> > > > +performance can be measured between cross-device accesses
> > > > within
> > > > +a single location and those spanning multiple locations.
> > > > +The first cell always contains the broadest subdivision within
> > > > the
> > > > system,
> > > > +while the last cell enumerates the individual devices, such as
> > > > an SMT
> > > > thread
> > > > +of a CPU, or a bus bridge within an SoC".
> > > 
> > > While this gives us some hierarchy, this doesn't seem to encode
> > > relative
> > > distances at all. That seems like an oversight.
> > 
> > 
> > The distance is computed; I will add the details to the document.
> > A local node has a distance of 10 (LOCAL_DISTANCE), and the distance
> > doubles with every level.
> > For example, in a level 1 NUMA topology, the distance from the local
> > node to a remote node is 20.
> > 
> > > 
> > > 
> > > Additionally, I'm somewhat unclear on how what you'd be expected
> > > to
> > > provide for this property in cases like ring or mesh
> > > interconnects,
> > > where there isn't a strict hierarchy (see systems with ARM's own
> > > CCN, or
> > > Tilera's TILE-Mx), but there is some measure of closeness.
> > 
> > 
> > IIUC, as per ARM's CCN architecture, all cores/clusters are at an
> > equal distance from DDR, so I don't see any NUMA topology.
> > However, if two SoCs are connected through the CCN, then it is
> > very similar to the Cavium topology.
> > 
> > > Must all of these have the same length? If so, why not have a
> > > #(whatever)-cells property in the root to describe the expected
> > > length?
> > > If not, how are they to be interpreted relative to each other?
> > 
> > 
> > yes, all are of default size.
> > IMHO, there is no need to add cells property.
> > > 
> > > 
> > > > +
> > > > +ex:
> > > 
> > > s/ex/Example:/, please. There's no need to contract that.
> > > 
> > > > +       /* board 0, socket 0, cluster 0, core 0  thread 0 */
> > > > +       arm,associativity = <0 0 0 0 0>;
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +3 - arm,associativity-reference-points
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +This property is a set of 32-bit integers, each representing
> > > > an index
> > > > into
> > > 
> > > Likeise, s/set/list/
> > 
> > ok
> > > 
> > > 
> > > > +the arm,associativity nodes. The first integer is the most
> > > > significant
> > > > +NUMA boundary and the following are progressively less
> > > > significant
> > > > boundaries.
> > > > +There can be more than one level of NUMA.
> > > 
> > > I'm not clear on why this is necessary; the arm,associativity
> > > property
> > > is already ordered from most significant to least significant per
> > > its
> > > description.
> > 
> > 
> > first entry in arm,associativity-reference-points is used to find
> > which
> > entry in associativity defines node id.
> > also entries in arm,associativity-reference-points defines,
> > how many entries(depth) in associativity can be used to calculate
> > node
> > distance
> > in both level 1 and  multi level(hierarchical) numa topology.
> > 
> > > 
> > > 
> > > What does this property achieve?
> > > 
> > > The description also doesn't describe where this property is
> > > expected to
> > > live. The example isn't sufficient to disambiguate that,
> > > especially as
> > > it seems like a trivial case.
> > 
> > sure, will add one more example to describe the
> > arm,associativity-reference-points
> > > 
> > > 
> > > Is this only expected at the root of the tree? Can it be re
> > > -defined in
> > > sub-nodes?
> > 
> > yes it is defined only at the root.
> > > 
> > > 
> > > > +
> > > > +Ex:
> > > 
> > > s/Ex/Example:/, please
> > 
> > sure.
> > > 
> > > 
> > > > +       arm,associativity-reference-points = <0 1>;
> > > > +       The board Id(index 0) used first to calculate the
> > > > associativity
> > > > (node
> > > > +       distance), then follows the  socket id(index 1).
> > > > +
> > > > +       arm,associativity-reference-points = <1 0>;
> > > > +       The socket Id(index 1) used first to calculate the
> > > > associativity,
> > > > +       then follows the board id(index 0).
> > > > +
> > > > +       arm,associativity-reference-points = <0>;
> > > > +       Only the board Id(index 0) used to calculate the
> > > > associativity.
> > > > +
> > > > +       arm,associativity-reference-points = <1>;
> > > > +       Only socket Id(index 1) used to calculate the
> > > > associativity.
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +4 - Example dts
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +
> > > > +Example: 2 Node system consists of 2 boards and each board
> > > > having one
> > > > socket
> > > > +and 8 core in each socket.
> > > > +
> > > > +       arm,associativity-reference-points = <0>;
> > > > +
> > > > +       memory@00c00000 {
> > > > +               device_type = "memory";
> > > > +               reg = <0x0 0x00c00000 0x0 0x80000000>;
> > > > +               /* board 0, socket 0, no specific core */
> > > > +               arm,associativity = <0 0 0xffff>;
> > > > +       };
> > > > +
> > > > +       memory@10000000000 {
> > > > +               device_type = "memory";
> > > > +               reg = <0x100 0x00000000 0x0 0x80000000>;
> > > > +               /* board 1, socket 0, no specific core */
> > > > +               arm,associativity = <1 0 0xffff>;
> > > > +       };
> > > > +
> > > > +       cpus {
> > > > +               #address-cells = <2>;
> > > > +               #size-cells = <0>;
> > > > +
> > > > +               cpu@000 {
> > > > +                       device_type = "cpu";
> > > > +                       compatible =  "arm,armv8";
> > > > +                       reg = <0x0 0x000>;
> > > > +                       enable-method = "psci";
> > > > +                       /* board 0, socket 0, core 0*/
> > > > +                       arm,associativity = <0 0 0>;
> > > 
> > > We should specify w.r.t. memory and CPUs how the property is
> > > expected to
> > > be used (e.g. in the CPU nodes rather than the cpu-map, with
> > > separate
> > > memory nodes, etc). The generic description of arm,associativity
> > > isn't
> > > sufficient to limit confusion there.
> > 
> > ok, will add the details like which nodes can use this property.
> > 
> > > 
> > > 
> > > Thanks,
> > > Mark.
> > 
> > 
> > thanks
> > Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-09-29  9:42                 ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 94+ messages in thread
From: Benjamin Herrenschmidt @ 2015-09-29  9:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2015-09-29 at 14:08 +0530, Ganapatrao Kulkarni wrote:
> (sending again, by mistake it was set to html mode)

I'm sorry, I was trying to get OpenPower to move faster & release PAPR
publicly but it looks like it's going to take a bit longer, so I'll try
to write a summary in the next couple of days.

Ben.

> On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
> <gpkulkarni@gmail.com> wrote:
> > Hi Mark,
> > 
> > I have tried to answer your comments, in the meantime we are
> > waiting for Ben
> > to share the details.
> > 
> > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <mark.rutland@arm.com
> > > wrote:
> > > 
> > > Hi,
> > > 
> > > On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni
> > > wrote:
> > > > DT bindings for numa map for memory, cores and IOs using
> > > > arm,associativity device node property.
> > > 
> > > Given this is just a copy of ibm,associativity, I'm not sure I
> > > see much
> > > point in renaming the properties.
> > > 
> > > However, (somewhat counter to that) I'm also concerned that this
> > > isn't
> > > sufficient for systems we're beginning to see today (more on that
> > > below), so I don't think a simple copy of ibm,associativity is
> > > good
> > > enough.
> > 
> > it is just copy right now, however it can evolve when we come
> > across more
> > arm64 numa platforms
> > > 
> > > 
> > > > 
> > > > Signed-off-by: Ganapatrao Kulkarni <
> > > > gkulkarni at caviumnetworks.com>
> > > > ---
> > > >  Documentation/devicetree/bindings/arm/numa.txt | 212
> > > > +++++++++++++++++++++++++
> > > >  1 file changed, 212 insertions(+)
> > > >  create mode 100644
> > > > Documentation/devicetree/bindings/arm/numa.txt
> > > > 
> > > > diff --git a/Documentation/devicetree/bindings/arm/numa.txt
> > > > b/Documentation/devicetree/bindings/arm/numa.txt
> > > > new file mode 100644
> > > > index 0000000..dc3ef86
> > > > --- /dev/null
> > > > +++ b/Documentation/devicetree/bindings/arm/numa.txt
> > > > @@ -0,0 +1,212 @@
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +NUMA binding description.
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +1 - Introduction
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +
> > > > +Systems employing a Non Uniform Memory Access (NUMA)
> > > > architecture
> > > > contain
> > > > +collections of hardware resources including processors,
> > > > memory, and I/O
> > > > buses,
> > > > +that comprise what is commonly known as a NUMA node.
> > > > +Processor accesses to memory within the local NUMA node is
> > > > generally
> > > > faster
> > > > +than processor accesses to memory outside of the local NUMA
> > > > node.
> > > > +DT defines interfaces that allow the platform to convey NUMA
> > > > node
> > > > +topology information to OS.
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +2 - arm,associativity
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +The mapping is done using arm,associativity device property.
> > > > +this property needs to be present in every device node which
> > > > needs to
> > > > to be
> > > > +mapped to numa nodes.
> > > 
> > > Can't there be some inheritance? e.g. all devices on a bus with
> > > an
> > > arm,associativity property being assumed to share that value?
> > 
> > yes there is inheritance and respective bus drivers should take
> > care of it,
> > like pci driver does at present.
> > > 
> > > 
> > > > +
> > > > +arm,associativity property is set of 32-bit integers which
> > > > defines
> > > > level of
> > > 
> > > s/set/list/ -- the order is important.
> > 
> > ok
> > > 
> > > 
> > > > +topology and boundary in the system at which a significant
> > > > difference
> > > > in
> > > > +performance can be measured between cross-device accesses
> > > > within
> > > > +a single location and those spanning multiple locations.
> > > > +The first cell always contains the broadest subdivision within
> > > > the
> > > > system,
> > > > +while the last cell enumerates the individual devices, such as
> > > > an SMT
> > > > thread
> > > > +of a CPU, or a bus bridge within an SoC".
> > > 
> > > While this gives us some hierarchy, this doesn't seem to encode
> > > relative
> > > distances at all. That seems like an oversight.
> > 
> > 
> > distance is computed, will add the details to document.
> > local nodes will have distance as 10(LOCAL_DISTANCE) and every
> > level, the
> > distance multiplies by 2.
> > for example, for level 1 numa topology, distance from local node to
> > remote
> > node will be 20.
> > 
> > > 
> > > 
> > > Additionally, I'm somewhat unclear on how what you'd be expected
> > > to
> > > provide for this property in cases like ring or mesh
> > > interconnects,
> > > where there isn't a strict hierarchy (see systems with ARM's own
> > > CCN, or
> > > Tilera's TILE-Mx), but there is some measure of closeness.
> > 
> > 
> > IIUC, as per ARMs CCN architecture, all core/clusters are at equal
> > distance
> > of DDR, i dont see any NUMA topology.
> > however, if there are 2 SoC connected thorough the CCN, then it is
> > very much
> > similar to cavium topology.
> > 
> > > Must all of these have the same length? If so, why not have a
> > > #(whatever)-cells property in the root to describe the expected
> > > length?
> > > If not, how are they to be interpreted relative to each other?
> > 
> > 
> > yes, all are of default size.
> > IMHO, there is no need to add cells property.
> > > 
> > > 
> > > > +
> > > > +ex:
> > > 
> > > s/ex/Example:/, please. There's no need to contract that.
> > > 
> > > > +       /* board 0, socket 0, cluster 0, core 0  thread 0 */
> > > > +       arm,associativity = <0 0 0 0 0>;
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +3 - arm,associativity-reference-points
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +This property is a set of 32-bit integers, each representing
> > > > an index
> > > > into
> > > 
> > > Likeise, s/set/list/
> > 
> > ok
> > > 
> > > 
> > > > +the arm,associativity nodes. The first integer is the most
> > > > significant
> > > > +NUMA boundary and the following are progressively less
> > > > significant
> > > > boundaries.
> > > > +There can be more than one level of NUMA.
> > > 
> > > I'm not clear on why this is necessary; the arm,associativity
> > > property
> > > is already ordered from most significant to least significant per
> > > its
> > > description.
> > 
> > 
> > first entry in arm,associativity-reference-points is used to find
> > which
> > entry in associativity defines node id.
> > also entries in arm,associativity-reference-points defines,
> > how many entries(depth) in associativity can be used to calculate
> > node
> > distance
> > in both level 1 and  multi level(hierarchical) numa topology.
> > 
> > > 
> > > 
> > > What does this property achieve?
> > > 
> > > The description also doesn't describe where this property is
> > > expected to
> > > live. The example isn't sufficient to disambiguate that,
> > > especially as
> > > it seems like a trivial case.
> > 
> > sure, will add one more example to describe the
> > arm,associativity-reference-points
> > > 
> > > 
> > > Is this only expected at the root of the tree? Can it be re
> > > -defined in
> > > sub-nodes?
> > 
> > yes it is defined only at the root.
> > > 
> > > 
> > > > +
> > > > +Ex:
> > > 
> > > s/Ex/Example:/, please
> > 
> > sure.
> > > 
> > > 
> > > > +       arm,associativity-reference-points = <0 1>;
> > > > +       The board Id(index 0) used first to calculate the
> > > > associativity
> > > > (node
> > > > +       distance), then follows the  socket id(index 1).
> > > > +
> > > > +       arm,associativity-reference-points = <1 0>;
> > > > +       The socket Id(index 1) used first to calculate the
> > > > associativity,
> > > > +       then follows the board id(index 0).
> > > > +
> > > > +       arm,associativity-reference-points = <0>;
> > > > +       Only the board Id(index 0) used to calculate the
> > > > associativity.
> > > > +
> > > > +       arm,associativity-reference-points = <1>;
> > > > +       Only socket Id(index 1) used to calculate the
> > > > associativity.
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +4 - Example dts
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +
> > > > +Example: 2 Node system consists of 2 boards and each board
> > > > having one
> > > > socket
> > > > +and 8 core in each socket.
> > > > +
> > > > +       arm,associativity-reference-points = <0>;
> > > > +
> > > > +       memory@00c00000 {
> > > > +               device_type = "memory";
> > > > +               reg = <0x0 0x00c00000 0x0 0x80000000>;
> > > > +               /* board 0, socket 0, no specific core */
> > > > +               arm,associativity = <0 0 0xffff>;
> > > > +       };
> > > > +
> > > > +       memory@10000000000 {
> > > > +               device_type = "memory";
> > > > +               reg = <0x100 0x00000000 0x0 0x80000000>;
> > > > +               /* board 1, socket 0, no specific core */
> > > > +               arm,associativity = <1 0 0xffff>;
> > > > +       };
> > > > +
> > > > +       cpus {
> > > > +               #address-cells = <2>;
> > > > +               #size-cells = <0>;
> > > > +
> > > > +               cpu@000 {
> > > > +                       device_type = "cpu";
> > > > +                       compatible =  "arm,armv8";
> > > > +                       reg = <0x0 0x000>;
> > > > +                       enable-method = "psci";
> > > > +                       /* board 0, socket 0, core 0*/
> > > > +                       arm,associativity = <0 0 0>;
> > > 
> > > We should specify w.r.t. memory and CPUs how the property is
> > > expected to
> > > be used (e.g. in the CPU nodes rather than the cpu-map, with
> > > separate
> > > memory nodes, etc). The generic description of arm,associativity
> > > isn't
> > > sufficient to limit confusion there.
> > 
> > ok, will add the details like which nodes can use this property.
> > 
> > > 
> > > 
> > > Thanks,
> > > Mark.
> > 
> > 
> > thanks
> > Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-09-29  8:38             ` Ganapatrao Kulkarni
@ 2015-09-30  0:28                 ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 94+ messages in thread
From: Benjamin Herrenschmidt @ 2015-09-30  0:28 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, Mark Rutland, Prasun Kapoor, Rob Herring,
	Leizhen (ThunderTown)
  Cc: Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4, Pawel Moll,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
	galak-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org

On Tue, 2015-09-29 at 14:08 +0530, Ganapatrao Kulkarni wrote:
> (sending again, by mistake it was set to html mode)

The representation consists of a hierarchy of domains, the idea being
that resources are grouped in domains of similar average performance
relative to each other.

The platform decides which "levels" of that hierarchy are significant. 

The "ibm,associativity" property allows determining the associativity
between two resources (i.e. nodes) at a given level.

Unfortunately that property went through changes, so another property
in the DT (ibm,architecture-vec-5) contains, among a bunch of other
things, a bit indicating which form of the ibm,associativity property
is used. I'm going to stick to the new "form 1" in this description.

The ibm,associativity property contains one or more lists of numbers (32-bit
cells), which represent the domains:

	< C1 , L1_1, L1_2, ... , C2, L2_1, L2_2, ... >

Where C1 (count 1) is the number of items for list 1, and L1_1,
L1_2, ... L1_C1 are the items for list 1, and same for C2/L2.
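
To make the layout concrete, here is a minimal sketch (plain Python, not
kernel code) of splitting such a flat cell array into its counted lists.
The function name parse_associativity and the sample values are made up
for illustration; only the < C1, L1_1, ..., C2, L2_1, ... > layout comes
from the description above.

```python
def parse_associativity(cells):
    """Split a flat form-1 cell array <C1 L1_1 ... C2 L2_1 ...> into lists.

    `cells` is a sequence of integers (the 32-bit cells of the property).
    Each list starts with its own item count, followed by that many items.
    """
    lists = []
    i = 0
    while i < len(cells):
        count = cells[i]
        items = list(cells[i + 1:i + 1 + count])
        if len(items) != count:
            # The count promised more cells than the property contains.
            raise ValueError("truncated associativity list")
        lists.append(items)
        i += 1 + count
    return lists


# Hypothetical property value holding two lists: one with 4 levels and
# one with 3 levels.
example = [4, 0, 0, 1, 8, 3, 0, 1, 2]
parsed = parse_associativity(example)
```

Here parsed holds two lists, [0, 0, 1, 8] and [0, 1, 2], one per path.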

The entries in those lists are domain numbers from the highest level of
grouping to the lowest (successive numbers are sub divisions)
for example drawer#, socket#, chip#, core#... with the lowest level
being the actual resource itself. So within a domain that last number
is generally unique.

Different resources can have different numbers of levels, for example if
we have a grouping of node,socket,chip,core, a CPU core node would have
a list with all 4 but a memory controller on a chip might have only the
first 3.

This is an important statement in the spec:

<<
The user of this information is cautioned not to imply
any specific physical/logical significance of the various intermediate
levels.
>>

We can have multiple lists because a given resource can be connected
via multiple paths in the same platform.

That means that to properly calculate the distance to another resource,
all the paths need to be looked at (assuming the HW will pick the
shortest).

Additionally, to help the OS, another property,
"ibm,associativity-reference-points", indicates which levels (which
indices in the above lists) are of biggest significance to the platform.
This can typically be used by an OS to decide what to consider a "NUMA
node" if the OS cannot operate on distances alone. This is a list of
1-based numbers representing indices in the associativity list. They
should be in order of significance of the boundary.
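
As a rough illustration of how an OS might combine the reference points
with the associativity lists, here is a sketch (plain Python, not kernel
code). It borrows the rule quoted earlier in this thread: a local
distance of 10, doubled for every level at which the two resources fall
in different domains. The assumptions are mine: domain numbers are
unique across the system at each level, a level missing from a shorter
list counts as a mismatch, and with several paths the shortest one wins.
All names and sample values are hypothetical.

```python
LOCAL_DISTANCE = 10  # distance of a resource to its own domain


def path_distance(a, b, ref_points):
    """Distance between two single associativity lists.

    `ref_points` holds 1-based indices into the lists, most significant
    boundary first. The walk stops at the first level where both
    resources share a domain; every mismatching level doubles the
    distance.
    """
    distance = LOCAL_DISTANCE
    for ref in ref_points:
        idx = ref - 1  # reference points are 1-based
        if idx < len(a) and idx < len(b) and a[idx] == b[idx]:
            break
        distance *= 2
    return distance


def resource_distance(lists_a, lists_b, ref_points):
    """Shortest distance over all pairs of paths of two resources."""
    return min(path_distance(a, b, ref_points)
               for a in lists_a for b in lists_b)


# Levels are drawer, socket, chip, core; socket is the most significant
# boundary, then drawer.
ref_points = [2, 1]
cpu_a = [[0, 0, 1, 8]]
cpu_b = [[0, 0, 2, 16]]   # same socket as cpu_a
cpu_c = [[0, 1, 3, 24]]   # same drawer, different socket
cpu_d = [[1, 2, 5, 40]]   # different drawer
```

With these values resource_distance() gives 10 for cpu_a/cpu_b, 20 for
cpu_a/cpu_c and 40 for cpu_a/cpu_d.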

Finally, the ibm,max-associativity-domains (in the /rtas node on
pseries) is an array of cells < C, M1, M2, ... MC > (first is
count) containing for each domain/level the max number supported
by the platform.

Ben.

> On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
> <gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > Hi Mark,
> > 
> > I have tried to answer your comments, in the meantime we are
> > waiting for Ben
> > to share the details.
> > 
> > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org
> > > wrote:
> > > 
> > > Hi,
> > > 
> > > On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni
> > > wrote:
> > > > DT bindings for numa map for memory, cores and IOs using
> > > > arm,associativity device node property.
> > > 
> > > Given this is just a copy of ibm,associativity, I'm not sure I
> > > see much
> > > point in renaming the properties.
> > > 
> > > However, (somewhat counter to that) I'm also concerned that this
> > > isn't
> > > sufficient for systems we're beginning to see today (more on that
> > > below), so I don't think a simple copy of ibm,associativity is
> > > good
> > > enough.
> > 
> > it is just copy right now, however it can evolve when we come
> > across more
> > arm64 numa platforms
> > > 
> > > 
> > > > 
> > > > Signed-off-by: Ganapatrao Kulkarni <
> > > > gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
> > > > ---
> > > >  Documentation/devicetree/bindings/arm/numa.txt | 212
> > > > +++++++++++++++++++++++++
> > > >  1 file changed, 212 insertions(+)
> > > >  create mode 100644
> > > > Documentation/devicetree/bindings/arm/numa.txt
> > > > 
> > > > diff --git a/Documentation/devicetree/bindings/arm/numa.txt
> > > > b/Documentation/devicetree/bindings/arm/numa.txt
> > > > new file mode 100644
> > > > index 0000000..dc3ef86
> > > > --- /dev/null
> > > > +++ b/Documentation/devicetree/bindings/arm/numa.txt
> > > > @@ -0,0 +1,212 @@
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +NUMA binding description.
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +1 - Introduction
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +
> > > > +Systems employing a Non Uniform Memory Access (NUMA)
> > > > architecture
> > > > contain
> > > > +collections of hardware resources including processors,
> > > > memory, and I/O
> > > > buses,
> > > > +that comprise what is commonly known as a NUMA node.
> > > > +Processor accesses to memory within the local NUMA node is
> > > > generally
> > > > faster
> > > > +than processor accesses to memory outside of the local NUMA
> > > > node.
> > > > +DT defines interfaces that allow the platform to convey NUMA
> > > > node
> > > > +topology information to OS.
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +2 - arm,associativity
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +The mapping is done using arm,associativity device property.
> > > > +this property needs to be present in every device node which
> > > > needs to
> > > > to be
> > > > +mapped to numa nodes.
> > > 
> > > Can't there be some inheritance? e.g. all devices on a bus with
> > > an
> > > arm,associativity property being assumed to share that value?
> > 
> > yes there is inheritance and respective bus drivers should take
> > care of it,
> > like pci driver does at present.
> > > 
> > > 
> > > > +
> > > > +arm,associativity property is set of 32-bit integers which
> > > > defines
> > > > level of
> > > 
> > > s/set/list/ -- the order is important.
> > 
> > ok
> > > 
> > > 
> > > > +topology and boundary in the system at which a significant
> > > > difference
> > > > in
> > > > +performance can be measured between cross-device accesses
> > > > within
> > > > +a single location and those spanning multiple locations.
> > > > +The first cell always contains the broadest subdivision within
> > > > the
> > > > system,
> > > > +while the last cell enumerates the individual devices, such as
> > > > an SMT
> > > > thread
> > > > +of a CPU, or a bus bridge within an SoC".
> > > 
> > > While this gives us some hierarchy, this doesn't seem to encode
> > > relative
> > > distances at all. That seems like an oversight.
> > 
> > 
> > distance is computed, will add the details to document.
> > local nodes will have distance as 10(LOCAL_DISTANCE) and every
> > level, the
> > distance multiplies by 2.
> > for example, for level 1 numa topology, distance from local node to
> > remote
> > node will be 20.
> > 
> > > 
> > > 
> > > Additionally, I'm somewhat unclear on how what you'd be expected
> > > to
> > > provide for this property in cases like ring or mesh
> > > interconnects,
> > > where there isn't a strict hierarchy (see systems with ARM's own
> > > CCN, or
> > > Tilera's TILE-Mx), but there is some measure of closeness.
> > 
> > 
> > IIUC, as per ARMs CCN architecture, all core/clusters are at equal
> > distance
> > of DDR, i dont see any NUMA topology.
> > however, if there are 2 SoC connected thorough the CCN, then it is
> > very much
> > similar to cavium topology.
> > 
> > > Must all of these have the same length? If so, why not have a
> > > #(whatever)-cells property in the root to describe the expected
> > > length?
> > > If not, how are they to be interpreted relative to each other?
> > 
> > 
> > yes, all are of default size.
> > IMHO, there is no need to add cells property.
> > > 
> > > 
> > > > +
> > > > +ex:
> > > 
> > > s/ex/Example:/, please. There's no need to contract that.
> > > 
> > > > +       /* board 0, socket 0, cluster 0, core 0  thread 0 */
> > > > +       arm,associativity = <0 0 0 0 0>;
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +3 - arm,associativity-reference-points
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +This property is a set of 32-bit integers, each representing
> > > > an index
> > > > into
> > > 
> > > Likeise, s/set/list/
> > 
> > ok
> > > 
> > > 
> > > > +the arm,associativity nodes. The first integer is the most
> > > > significant
> > > > +NUMA boundary and the following are progressively less
> > > > significant
> > > > boundaries.
> > > > +There can be more than one level of NUMA.
> > > 
> > > I'm not clear on why this is necessary; the arm,associativity
> > > property
> > > is already ordered from most significant to least significant per
> > > its
> > > description.
> > 
> > 
> > first entry in arm,associativity-reference-points is used to find
> > which
> > entry in associativity defines node id.
> > also entries in arm,associativity-reference-points defines,
> > how many entries(depth) in associativity can be used to calculate
> > node
> > distance
> > in both level 1 and  multi level(hierarchical) numa topology.
> > 
> > > 
> > > 
> > > What does this property achieve?
> > > 
> > > The description also doesn't describe where this property is
> > > expected to
> > > live. The example isn't sufficient to disambiguate that,
> > > especially as
> > > it seems like a trivial case.
> > 
> > sure, will add one more example to describe the
> > arm,associativity-reference-points
> > > 
> > > 
> > > Is this only expected at the root of the tree? Can it be re
> > > -defined in
> > > sub-nodes?
> > 
> > yes it is defined only at the root.
> > > 
> > > 
> > > > +
> > > > +Ex:
> > > 
> > > s/Ex/Example:/, please
> > 
> > sure.
> > > 
> > > 
> > > > +       arm,associativity-reference-points = <0 1>;
> > > > +       The board Id(index 0) used first to calculate the
> > > > associativity
> > > > (node
> > > > +       distance), then follows the  socket id(index 1).
> > > > +
> > > > +       arm,associativity-reference-points = <1 0>;
> > > > +       The socket Id(index 1) used first to calculate the
> > > > associativity,
> > > > +       then follows the board id(index 0).
> > > > +
> > > > +       arm,associativity-reference-points = <0>;
> > > > +       Only the board Id(index 0) used to calculate the
> > > > associativity.
> > > > +
> > > > +       arm,associativity-reference-points = <1>;
> > > > +       Only socket Id(index 1) used to calculate the
> > > > associativity.
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +4 - Example dts
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +
> > > > +Example: 2 Node system consists of 2 boards and each board
> > > > having one
> > > > socket
> > > > +and 8 core in each socket.
> > > > +
> > > > +       arm,associativity-reference-points = <0>;
> > > > +
> > > > +       memory@00c00000 {
> > > > +               device_type = "memory";
> > > > +               reg = <0x0 0x00c00000 0x0 0x80000000>;
> > > > +               /* board 0, socket 0, no specific core */
> > > > +               arm,associativity = <0 0 0xffff>;
> > > > +       };
> > > > +
> > > > +       memory@10000000000 {
> > > > +               device_type = "memory";
> > > > +               reg = <0x100 0x00000000 0x0 0x80000000>;
> > > > +               /* board 1, socket 0, no specific core */
> > > > +               arm,associativity = <1 0 0xffff>;
> > > > +       };
> > > > +
> > > > +       cpus {
> > > > +               #address-cells = <2>;
> > > > +               #size-cells = <0>;
> > > > +
> > > > +               cpu@000 {
> > > > +                       device_type = "cpu";
> > > > +                       compatible =  "arm,armv8";
> > > > +                       reg = <0x0 0x000>;
> > > > +                       enable-method = "psci";
> > > > +                       /* board 0, socket 0, core 0*/
> > > > +                       arm,associativity = <0 0 0>;
> > > 
> > > We should specify w.r.t. memory and CPUs how the property is
> > > expected to
> > > be used (e.g. in the CPU nodes rather than the cpu-map, with
> > > separate
> > > memory nodes, etc). The generic description of arm,associativity
> > > isn't
> > > sufficient to limit confusion there.
> > 
> > ok, will add the details like which nodes can use this property.
> > 
> > > 
> > > 
> > > Thanks,
> > > Mark.
> > 
> > 
> > thanks
> > Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-09-30  0:28                 ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 94+ messages in thread
From: Benjamin Herrenschmidt @ 2015-09-30  0:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2015-09-29 at 14:08 +0530, Ganapatrao Kulkarni wrote:
> (sending again, by mistake it was set to html mode)

The representation consists of a hierarchy of domains, the idea being
that resources are grouped in domains of similar average performance
relative to each other.

The platform decides which "levels" of that hierarchy are significant. 

The "ibm,associativity" property allows determining the associativity
between two resources (i.e. nodes) at a given level.

Unfortunately that property went through changes, so another property
in the DT (ibm,architecture-vec-5) contains, among a bunch of other
things, a bit indicating which form of the ibm,associativity property
is used. I'm going to stick to the new "form 1" in this description.

The ibm,associativity property contains one or more lists of numbers
(32-bit cells), which represent the domains:

	< C1 , L1_1, L1_2, ... , C2, L2_1, L2_2, ... >

Where C1 (count 1) is the number of items for list 1, and L1_1,
L1_2, ... L1_C1 are the items for list 1, and same for C2/L2.

The entries in those lists are domain numbers from the highest level of
grouping to the lowest (successive numbers are subdivisions),
for example drawer#, socket#, chip#, core#... with the lowest level
being the actual resource itself. So within a domain that last number
is generally unique.

Different resources can have different numbers of levels. For example, if
we have a grouping of node,socket,chip,core, a CPU core node would have
a list with all 4 but a memory controller on a chip might have only the
first 3.
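
To make the layout concrete, here is a minimal sketch (in Python, purely
for illustration, not kernel code) of splitting the flat cell array into
its lists. The function name parse_associativity and all cell values are
made up for this example; they do not come from any real platform.

```python
def parse_associativity(cells):
    """Split a flat array <C1, L1_1, ..., C2, L2_1, ...> into its lists."""
    lists = []
    i = 0
    while i < len(cells):
        count = cells[i]                      # C_n: number of items that follow
        items = cells[i + 1:i + 1 + count]    # L_n_1 ... L_n_Cn
        if len(items) != count:
            raise ValueError("truncated associativity list")
        lists.append(list(items))
        i += 1 + count
    return lists


# Hypothetical values, grouping is (node, socket, chip, core).
cpu_core = parse_associativity([4, 0, 1, 2, 5])      # all four levels
mem_ctrl = parse_associativity([3, 0, 1, 2])         # stops at chip level
dual_path = parse_associativity([2, 0, 1, 2, 3, 4])  # two lists of two items
```

With these values the CPU core and the memory controller share the same
node, socket and chip numbers; only the core carries a fourth level.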

This is an important statement in the spec:

<<
The user of this information is cautioned not to imply
any specific physical/logical significance of the various intermediate
levels.
>>

We can have multiple lists because a given resource can be connected
via multiple paths in the same platform.

That means that to properly calculate the distance to another resource,
all the paths need to be looked at (assuming the HW will pick the
shortest).
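
One way to read this, sketched in Python for illustration only: compare
every list of one resource against every list of the other and keep the
best match. Using the length of the shared leading run of domain numbers
as the measure of closeness is an assumption of this sketch, as are the
function names and the values.

```python
def shared_depth(list_a, list_b):
    """Count how many leading domain levels two lists have in common."""
    depth = 0
    for dom_a, dom_b in zip(list_a, list_b):
        if dom_a != dom_b:
            break
        depth += 1
    return depth


def closest_shared_depth(lists_a, lists_b):
    """Look at every pair of paths and keep the deepest shared level."""
    return max(shared_depth(a, b) for a in lists_a for b in lists_b)


# Hypothetical resources: the first is reachable via two paths.
res_a = [[0, 0, 1], [0, 1, 1]]
res_b = [[0, 1, 2]]
best = closest_shared_depth(res_a, res_b)
```

Here the first path of res_a shares only the top level with res_b, while
the second path shares two levels, so the second path is the one that
counts.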

Additionally, to help the OS, another property,
"ibm,associativity-reference-points", indicates which levels (which
indices in the above lists) are of biggest significance to the platform.
This can typically be used by an OS to decide what to consider a "NUMA
node" if the OS cannot operate on distances alone. This is a list of
1-based numbers representing indices in the associativity list. They
should be in order of significance of the boundary.
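
A rough sketch of how an OS might use the reference points (Python,
illustration only). Taking the first reference point as the NUMA node id
is an assumption, and the doubling rule is the one Ganapat describes for
the arm64 proposal in the quoted text below (LOCAL_DISTANCE of 10,
multiplied by 2 per level). Function names and values are hypothetical.

```python
LOCAL_DISTANCE = 10


def node_id(assoc, ref_points):
    """Domain number at the most significant reference point (1-based)."""
    return assoc[ref_points[0] - 1]


def distance(assoc_a, assoc_b, ref_points):
    """Double the distance for each reference level where the two differ."""
    dist = LOCAL_DISTANCE
    for index in ref_points:
        if assoc_a[index - 1] == assoc_b[index - 1]:
            break                 # same domain at this level, stop here
        dist *= 2
    return dist


# Hypothetical lists, grouping is (drawer, socket, chip, core).
core_a = [0, 0, 1, 4]
core_b = [0, 1, 0, 9]   # same drawer, different socket
core_c = [1, 2, 0, 3]   # different drawer
ref_points = [2, 1]     # socket is the primary boundary, then drawer

d_local = distance(core_a, core_a, ref_points)
d_socket = distance(core_a, core_b, ref_points)
d_drawer = distance(core_a, core_c, ref_points)
```

With these values the three distances come out as 10, 20 and 40.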

Finally, the ibm,max-associativity-domains (in the /rtas node on
pseries) is an array of cells < C, M1, M2, ... MC > (first is
count) containing for each domain/level the max number supported
by the platform.
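
As an illustration only (Python, hypothetical names and values), the
array can be decoded the same way as the associativity lists. Reading the
entry at the primary reference point as an upper bound on the number of
NUMA nodes is an assumption of this sketch, not something stated above.

```python
def parse_max_domains(cells):
    """Decode <C, M1, ..., MC> into the per-level maxima [M1, ..., MC]."""
    count = cells[0]
    maxima = list(cells[1:1 + count])
    if len(maxima) != count:
        raise ValueError("array is shorter than its count cell claims")
    return maxima


def max_numa_nodes(cells, ref_points):
    """Assumed bound on NUMA nodes: the maximum at the primary level."""
    maxima = parse_max_domains(cells)
    return maxima[ref_points[0] - 1]   # reference points are 1-based


# Hypothetical platform limits for (drawer, socket, chip, core).
max_domains = [4, 2, 4, 8, 64]
ref_points = [2, 1]
node_limit = max_numa_nodes(max_domains, ref_points)
```

Under that reading an OS could size its per-node tables from node_limit
at boot, before every resource has been enumerated.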

Ben.

> On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
> <gpkulkarni@gmail.com> wrote:
> > Hi Mark,
> > 
> > I have tried to answer your comments, in the meantime we are
> > waiting for Ben
> > to share the details.
> > 
> > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <mark.rutland@arm.com
> > > wrote:
> > > 
> > > Hi,
> > > 
> > > On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni
> > > wrote:
> > > > DT bindings for numa map for memory, cores and IOs using
> > > > arm,associativity device node property.
> > > 
> > > Given this is just a copy of ibm,associativity, I'm not sure I
> > > see much
> > > point in renaming the properties.
> > > 
> > > However, (somewhat counter to that) I'm also concerned that this
> > > isn't
> > > sufficient for systems we're beginning to see today (more on that
> > > below), so I don't think a simple copy of ibm,associativity is
> > > good
> > > enough.
> > 
> > it is just copy right now, however it can evolve when we come
> > across more
> > arm64 numa platforms
> > > 
> > > 
> > > > 
> > > > Signed-off-by: Ganapatrao Kulkarni <
> > > > gkulkarni at caviumnetworks.com>
> > > > ---
> > > >  Documentation/devicetree/bindings/arm/numa.txt | 212
> > > > +++++++++++++++++++++++++
> > > >  1 file changed, 212 insertions(+)
> > > >  create mode 100644
> > > > Documentation/devicetree/bindings/arm/numa.txt
> > > > 
> > > > diff --git a/Documentation/devicetree/bindings/arm/numa.txt
> > > > b/Documentation/devicetree/bindings/arm/numa.txt
> > > > new file mode 100644
> > > > index 0000000..dc3ef86
> > > > --- /dev/null
> > > > +++ b/Documentation/devicetree/bindings/arm/numa.txt
> > > > @@ -0,0 +1,212 @@
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +NUMA binding description.
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +1 - Introduction
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +
> > > > +Systems employing a Non Uniform Memory Access (NUMA)
> > > > architecture
> > > > contain
> > > > +collections of hardware resources including processors,
> > > > memory, and I/O
> > > > buses,
> > > > +that comprise what is commonly known as a NUMA node.
> > > > +Processor accesses to memory within the local NUMA node is
> > > > generally
> > > > faster
> > > > +than processor accesses to memory outside of the local NUMA
> > > > node.
> > > > +DT defines interfaces that allow the platform to convey NUMA
> > > > node
> > > > +topology information to OS.
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +2 - arm,associativity
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +The mapping is done using arm,associativity device property.
> > > > +this property needs to be present in every device node which
> > > > needs to
> > > > to be
> > > > +mapped to numa nodes.
> > > 
> > > Can't there be some inheritance? e.g. all devices on a bus with
> > > an
> > > arm,associativity property being assumed to share that value?
> > 
> > yes there is inheritance and respective bus drivers should take
> > care of it,
> > like pci driver does at present.
> > > 
> > > 
> > > > +
> > > > +arm,associativity property is set of 32-bit integers which
> > > > defines
> > > > level of
> > > 
> > > s/set/list/ -- the order is important.
> > 
> > ok
> > > 
> > > 
> > > > +topology and boundary in the system at which a significant
> > > > difference
> > > > in
> > > > +performance can be measured between cross-device accesses
> > > > within
> > > > +a single location and those spanning multiple locations.
> > > > +The first cell always contains the broadest subdivision within
> > > > the
> > > > system,
> > > > +while the last cell enumerates the individual devices, such as
> > > > an SMT
> > > > thread
> > > > +of a CPU, or a bus bridge within an SoC".
> > > 
> > > While this gives us some hierarchy, this doesn't seem to encode
> > > relative
> > > distances at all. That seems like an oversight.
> > 
> > 
> > distance is computed, will add the details to document.
> > local nodes will have distance as 10(LOCAL_DISTANCE) and every
> > level, the
> > distance multiplies by 2.
> > for example, for level 1 numa topology, distance from local node to
> > remote
> > node will be 20.
> > 
> > > 
> > > 
> > > Additionally, I'm somewhat unclear on how what you'd be expected
> > > to
> > > provide for this property in cases like ring or mesh
> > > interconnects,
> > > where there isn't a strict hierarchy (see systems with ARM's own
> > > CCN, or
> > > Tilera's TILE-Mx), but there is some measure of closeness.
> > 
> > 
> > IIUC, as per ARMs CCN architecture, all core/clusters are at equal
> > distance
> > of DDR, i dont see any NUMA topology.
> > however, if there are 2 SoC connected thorough the CCN, then it is
> > very much
> > similar to cavium topology.
> > 
> > > Must all of these have the same length? If so, why not have a
> > > #(whatever)-cells property in the root to describe the expected
> > > length?
> > > If not, how are they to be interpreted relative to each other?
> > 
> > 
> > yes, all are of default size.
> > IMHO, there is no need to add cells property.
> > > 
> > > 
> > > > +
> > > > +ex:
> > > 
> > > s/ex/Example:/, please. There's no need to contract that.
> > > 
> > > > +       /* board 0, socket 0, cluster 0, core 0  thread 0 */
> > > > +       arm,associativity = <0 0 0 0 0>;
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +3 - arm,associativity-reference-points
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +This property is a set of 32-bit integers, each representing
> > > > an index
> > > > into
> > > 
> > > Likeise, s/set/list/
> > 
> > ok
> > > 
> > > 
> > > > +the arm,associativity nodes. The first integer is the most
> > > > significant
> > > > +NUMA boundary and the following are progressively less
> > > > significant
> > > > boundaries.
> > > > +There can be more than one level of NUMA.
> > > 
> > > I'm not clear on why this is necessary; the arm,associativity
> > > property
> > > is already ordered from most significant to least significant per
> > > its
> > > description.
> > 
> > 
> > first entry in arm,associativity-reference-points is used to find
> > which
> > entry in associativity defines node id.
> > also entries in arm,associativity-reference-points defines,
> > how many entries(depth) in associativity can be used to calculate
> > node
> > distance
> > in both level 1 and  multi level(hierarchical) numa topology.
> > 
> > > 
> > > 
> > > What does this property achieve?
> > > 
> > > The description also doesn't describe where this property is
> > > expected to
> > > live. The example isn't sufficient to disambiguate that,
> > > especially as
> > > it seems like a trivial case.
> > 
> > sure, will add one more example to describe the
> > arm,associativity-reference-points
> > > 
> > > 
> > > Is this only expected at the root of the tree? Can it be re
> > > -defined in
> > > sub-nodes?
> > 
> > yes it is defined only at the root.
> > > 
> > > 
> > > > +
> > > > +Ex:
> > > 
> > > s/Ex/Example:/, please
> > 
> > sure.
> > > 
> > > 
> > > > +       arm,associativity-reference-points = <0 1>;
> > > > +       The board Id(index 0) used first to calculate the
> > > > associativity
> > > > (node
> > > > +       distance), then follows the  socket id(index 1).
> > > > +
> > > > +       arm,associativity-reference-points = <1 0>;
> > > > +       The socket Id(index 1) used first to calculate the
> > > > associativity,
> > > > +       then follows the board id(index 0).
> > > > +
> > > > +       arm,associativity-reference-points = <0>;
> > > > +       Only the board Id(index 0) used to calculate the
> > > > associativity.
> > > > +
> > > > +       arm,associativity-reference-points = <1>;
> > > > +       Only socket Id(index 1) used to calculate the
> > > > associativity.
> > > > +
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +4 - Example dts
> > > > 
> > > > +==============================================================
> > > > ================
> > > > +
> > > > +Example: 2 Node system consists of 2 boards and each board
> > > > having one
> > > > socket
> > > > +and 8 core in each socket.
> > > > +
> > > > +       arm,associativity-reference-points = <0>;
> > > > +
> > > > +       memory at 00c00000 {
> > > > +               device_type = "memory";
> > > > +               reg = <0x0 0x00c00000 0x0 0x80000000>;
> > > > +               /* board 0, socket 0, no specific core */
> > > > +               arm,associativity = <0 0 0xffff>;
> > > > +       };
> > > > +
> > > > +       memory at 10000000000 {
> > > > +               device_type = "memory";
> > > > +               reg = <0x100 0x00000000 0x0 0x80000000>;
> > > > +               /* board 1, socket 0, no specific core */
> > > > +               arm,associativity = <1 0 0xffff>;
> > > > +       };
> > > > +
> > > > +       cpus {
> > > > +               #address-cells = <2>;
> > > > +               #size-cells = <0>;
> > > > +
> > > > +               cpu at 000 {
> > > > +                       device_type = "cpu";
> > > > +                       compatible =  "arm,armv8";
> > > > +                       reg = <0x0 0x000>;
> > > > +                       enable-method = "psci";
> > > > +                       /* board 0, socket 0, core 0*/
> > > > +                       arm,associativity = <0 0 0>;
> > > 
> > > We should specify w.r.t. memory and CPUs how the property is
> > > expected to
> > > be used (e.g. in the CPU nodes rather than the cpu-map, with
> > > separate
> > > memory nodes, etc). The generic description of arm,associativity
> > > isn't
> > > sufficient to limit confusion there.
> > 
> > ok, will add the details like which nodes can use this property.
> > 
> > > 
> > > 
> > > Thanks,
> > > Mark.
> > 
> > 
> > thanks
> > Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-09-30  0:28                 ` Benjamin Herrenschmidt
@ 2015-09-30 10:19                     ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-09-30 10:19 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Mark Rutland, Prasun Kapoor, Rob Herring, Leizhen (ThunderTown),
	Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org

Thanks Ben for the details.

On Wed, Sep 30, 2015 at 5:58 AM, Benjamin Herrenschmidt
<benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org> wrote:
> On Tue, 2015-09-29 at 14:08 +0530, Ganapatrao Kulkarni wrote:
>> (sending again, by mistake it was set to html mode)
>
> The representation consists of a hierarchy of domains, the idea being
> that resources are grouped in domains of similar average performance
> relative to each other.
>
> The platform decides which "levels" of that hierarchy are significant.
>
> The "ibm,associativity" property allows to determine the associatitivy
> between two resources (ie nodes) at a given level.
>
> Unfortunately that property went through changes, so another property
> in the DT (ibm,architecture-vec-5) contains, among a bunch of other
> things, a bit indicating which form of the ibm,associativity property
> is used. I'm going to stick to the new "form 1" in this description.
>
> The ibm,associativity contains one or more lists of numbers (32-bit
> cells), which represent the domains:
>
>         < C1 , L1_1, L1_2, ... , C2, L2_1, L2_2, ... >
>
> Where C1 (count 1) is the number of items for list 1, and L1_1,
> L1_2, ... L1_C1 are the items for list 1, and same for C2/L2.
Can you please give some examples for more clarity?
>
> The entries in those lists are domain numbers from the highest level of
> grouping to the lowest (successive numbers are sub divisions)
> for example drawer#, socket#, chip#, core#... with the lowest level
> being the actual resource itself. So within a domain that last number
> is generally unique.
>
> Different resources can have different number of levels, for example if
> we have a grouping of node,socket,chip,core, a CPU core node would have
> a list with all 4 but a memory controller on a chip might have only the
> first 3.
Can you please give some examples for more clarity?
>
> This is an important statement in the spec:
>
> <<
> The user of this information is cautioned not to imply
> any specific physical/logical significance of the various intermediate
> levels.
>>>
>
> We can have multiple lists because a given resource can be connected
> via multiple path in the same platform.
>
> That means that to properly calculate the distance to another resource,
> all the path need to be looked at (assuming the HW will pick the
> shortest).
>
> Additionally, to help the OS, another property "ibm,associativity
> -reference-points" property indicates which levels (which indices in
> the above lists) are of biggest significance to the platform. This can
> typically be used by an OS to decide what to consider a "NUMA node"
> if the OS cannot operate on distances alone. This is a list of 1-based
> numbers representing indices in the associativity list. They should
> be in order of significance of the boundary.
Some examples, please.
>
> Finally, the ibm,max-associativity-domains (in the /rtas node on
> pseries) is an array of cells < C, M1, M2, ... MC > (first is
> count) containing for each domain/level the max number supported
> by the platform.
Max number of what? CPUs?
How does this help?
Please give some examples to help understand this.
>
> Ben.
>
>> On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
>> <gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> > Hi Mark,
>> >
>> > I have tried to answer your comments, in the meantime we are
>> > waiting for Ben
>> > to share the details.
>> >
>> > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org
>> > > wrote:
>> > >
>> > > Hi,
>> > >
>> > > On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni
>> > > wrote:
>> > > > DT bindings for numa map for memory, cores and IOs using
>> > > > arm,associativity device node property.
>> > >
>> > > Given this is just a copy of ibm,associativity, I'm not sure I
>> > > see much
>> > > point in renaming the properties.
>> > >
>> > > However, (somewhat counter to that) I'm also concerned that this
>> > > isn't
>> > > sufficient for systems we're beginning to see today (more on that
>> > > below), so I don't think a simple copy of ibm,associativity is
>> > > good
>> > > enough.
>> >
>> > it is just copy right now, however it can evolve when we come
>> > across more
>> > arm64 numa platforms
>> > >
>> > >
>> > > >
>> > > > Signed-off-by: Ganapatrao Kulkarni <
>> > > > gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
>> > > > ---
>> > > >  Documentation/devicetree/bindings/arm/numa.txt | 212
>> > > > +++++++++++++++++++++++++
>> > > >  1 file changed, 212 insertions(+)
>> > > >  create mode 100644
>> > > > Documentation/devicetree/bindings/arm/numa.txt
>> > > >
>> > > > diff --git a/Documentation/devicetree/bindings/arm/numa.txt
>> > > > b/Documentation/devicetree/bindings/arm/numa.txt
>> > > > new file mode 100644
>> > > > index 0000000..dc3ef86
>> > > > --- /dev/null
>> > > > +++ b/Documentation/devicetree/bindings/arm/numa.txt
>> > > > @@ -0,0 +1,212 @@
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +NUMA binding description.
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +1 - Introduction
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +
>> > > > +Systems employing a Non Uniform Memory Access (NUMA)
>> > > > architecture
>> > > > contain
>> > > > +collections of hardware resources including processors,
>> > > > memory, and I/O
>> > > > buses,
>> > > > +that comprise what is commonly known as a NUMA node.
>> > > > +Processor accesses to memory within the local NUMA node is
>> > > > generally
>> > > > faster
>> > > > +than processor accesses to memory outside of the local NUMA
>> > > > node.
>> > > > +DT defines interfaces that allow the platform to convey NUMA
>> > > > node
>> > > > +topology information to OS.
>> > > > +
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +2 - arm,associativity
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +The mapping is done using arm,associativity device property.
>> > > > +this property needs to be present in every device node which
>> > > > needs to
>> > > > to be
>> > > > +mapped to numa nodes.
>> > >
>> > > Can't there be some inheritance? e.g. all devices on a bus with
>> > > an
>> > > arm,associativity property being assumed to share that value?
>> >
>> > yes there is inheritance and respective bus drivers should take
>> > care of it,
>> > like pci driver does at present.
>> > >
>> > >
>> > > > +
>> > > > +arm,associativity property is set of 32-bit integers which
>> > > > defines
>> > > > level of
>> > >
>> > > s/set/list/ -- the order is important.
>> >
>> > ok
>> > >
>> > >
>> > > > +topology and boundary in the system at which a significant
>> > > > difference
>> > > > in
>> > > > +performance can be measured between cross-device accesses
>> > > > within
>> > > > +a single location and those spanning multiple locations.
>> > > > +The first cell always contains the broadest subdivision within
>> > > > the
>> > > > system,
>> > > > +while the last cell enumerates the individual devices, such as
>> > > > an SMT
>> > > > thread
>> > > > +of a CPU, or a bus bridge within an SoC".
>> > >
>> > > While this gives us some hierarchy, this doesn't seem to encode
>> > > relative
>> > > distances at all. That seems like an oversight.
>> >
>> >
>> > distance is computed, will add the details to document.
>> > local nodes will have distance as 10(LOCAL_DISTANCE) and every
>> > level, the
>> > distance multiplies by 2.
>> > for example, for level 1 numa topology, distance from local node to
>> > remote
>> > node will be 20.
>> >
>> > >
>> > >
>> > > Additionally, I'm somewhat unclear on how what you'd be expected
>> > > to
>> > > provide for this property in cases like ring or mesh
>> > > interconnects,
>> > > where there isn't a strict hierarchy (see systems with ARM's own
>> > > CCN, or
>> > > Tilera's TILE-Mx), but there is some measure of closeness.
>> >
>> >
>> > IIUC, as per ARMs CCN architecture, all core/clusters are at equal
>> > distance
>> > of DDR, i dont see any NUMA topology.
>> > however, if there are 2 SoC connected thorough the CCN, then it is
>> > very much
>> > similar to cavium topology.
>> >
>> > > Must all of these have the same length? If so, why not have a
>> > > #(whatever)-cells property in the root to describe the expected
>> > > length?
>> > > If not, how are they to be interpreted relative to each other?
>> >
>> >
>> > yes, all are of default size.
>> > IMHO, there is no need to add cells property.
>> > >
>> > >
>> > > > +
>> > > > +ex:
>> > >
>> > > s/ex/Example:/, please. There's no need to contract that.
>> > >
>> > > > +       /* board 0, socket 0, cluster 0, core 0  thread 0 */
>> > > > +       arm,associativity = <0 0 0 0 0>;
>> > > > +
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +3 - arm,associativity-reference-points
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +This property is a set of 32-bit integers, each representing
>> > > > an index
>> > > > into
>> > >
>> > > Likeise, s/set/list/
>> >
>> > ok
>> > >
>> > >
>> > > > +the arm,associativity nodes. The first integer is the most
>> > > > significant
>> > > > +NUMA boundary and the following are progressively less
>> > > > significant
>> > > > boundaries.
>> > > > +There can be more than one level of NUMA.
>> > >
>> > > I'm not clear on why this is necessary; the arm,associativity
>> > > property
>> > > is already ordered from most significant to least significant per
>> > > its
>> > > description.
>> >
>> >
>> > first entry in arm,associativity-reference-points is used to find
>> > which
>> > entry in associativity defines node id.
>> > also entries in arm,associativity-reference-points defines,
>> > how many entries(depth) in associativity can be used to calculate
>> > node
>> > distance
>> > in both level 1 and  multi level(hierarchical) numa topology.
>> >
>> > >
>> > >
>> > > What does this property achieve?
>> > >
>> > > The description also doesn't describe where this property is
>> > > expected to
>> > > live. The example isn't sufficient to disambiguate that,
>> > > especially as
>> > > it seems like a trivial case.
>> >
>> > sure, will add one more example to describe the
>> > arm,associativity-reference-points
>> > >
>> > >
>> > > Is this only expected at the root of the tree? Can it be re
>> > > -defined in
>> > > sub-nodes?
>> >
>> > yes it is defined only at the root.
>> > >
>> > >
>> > > > +
>> > > > +Ex:
>> > >
>> > > s/Ex/Example:/, please
>> >
>> > sure.
>> > >
>> > >
>> > > > +       arm,associativity-reference-points = <0 1>;
>> > > > +       The board Id(index 0) used first to calculate the
>> > > > associativity
>> > > > (node
>> > > > +       distance), then follows the  socket id(index 1).
>> > > > +
>> > > > +       arm,associativity-reference-points = <1 0>;
>> > > > +       The socket Id(index 1) used first to calculate the
>> > > > associativity,
>> > > > +       then follows the board id(index 0).
>> > > > +
>> > > > +       arm,associativity-reference-points = <0>;
>> > > > +       Only the board Id(index 0) used to calculate the
>> > > > associativity.
>> > > > +
>> > > > +       arm,associativity-reference-points = <1>;
>> > > > +       Only socket Id(index 1) used to calculate the
>> > > > associativity.
>> > > > +
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +4 - Example dts
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +
>> > > > +Example: 2 Node system consists of 2 boards and each board
>> > > > having one
>> > > > socket
>> > > > +and 8 core in each socket.
>> > > > +
>> > > > +       arm,associativity-reference-points = <0>;
>> > > > +
>> > > > +       memory@00c00000 {
>> > > > +               device_type = "memory";
>> > > > +               reg = <0x0 0x00c00000 0x0 0x80000000>;
>> > > > +               /* board 0, socket 0, no specific core */
>> > > > +               arm,associativity = <0 0 0xffff>;
>> > > > +       };
>> > > > +
>> > > > +       memory@10000000000 {
>> > > > +               device_type = "memory";
>> > > > +               reg = <0x100 0x00000000 0x0 0x80000000>;
>> > > > +               /* board 1, socket 0, no specific core */
>> > > > +               arm,associativity = <1 0 0xffff>;
>> > > > +       };
>> > > > +
>> > > > +       cpus {
>> > > > +               #address-cells = <2>;
>> > > > +               #size-cells = <0>;
>> > > > +
>> > > > +               cpu@000 {
>> > > > +                       device_type = "cpu";
>> > > > +                       compatible =  "arm,armv8";
>> > > > +                       reg = <0x0 0x000>;
>> > > > +                       enable-method = "psci";
>> > > > +                       /* board 0, socket 0, core 0*/
>> > > > +                       arm,associativity = <0 0 0>;
>> > >
>> > > We should specify w.r.t. memory and CPUs how the property is
>> > > expected to
>> > > be used (e.g. in the CPU nodes rather than the cpu-map, with
>> > > separate
>> > > memory nodes, etc). The generic description of arm,associativity
>> > > isn't
>> > > sufficient to limit confusion there.
>> >
>> > ok, will add the details like which nodes can use this property.
>> >
>> > >
>> > >
>> > > Thanks,
>> > > Mark.
>> >
>> >
>> > thanks
>> > Ganapat
thanks
Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-09-30 10:19                     ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-09-30 10:19 UTC (permalink / raw)
  To: linux-arm-kernel

Thanks Ben for the details.

On Wed, Sep 30, 2015 at 5:58 AM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Tue, 2015-09-29 at 14:08 +0530, Ganapatrao Kulkarni wrote:
>> (sending again, by mistake it was set to html mode)
>
> The representation consists of a hierarchy of domains, the idea being
> that resources are grouped in domains of similar average performance
> relative to each other.
>
> The platform decides which "levels" of that hierarchy are significant.
>
> The "ibm,associativity" property allows to determine the associatitivy
> between two resources (ie nodes) at a given level.
>
> Unfortunately that property went through changes, so another property
> in the DT (ibm,architecture-vec-5) contains, among a bunch of other
> things, a bit indicating which form of the ibm,associativity property
> is used. I'm going to stick to the new "form 1" in this description.
>
> The ibm,associativity contains one or more lists of numbers (32-bit
> cells), which represent the domains:
>
>         < C1 , L1_1, L1_2, ... , C2, L2_1, L2_2, ... >
>
> Where C1 (count 1) is the number of items for list 1, and L1_1,
> L1_2, ... L1_C1 are the items for list 1, and same for C2/L2.
Can you please give some examples for more clarity?
>
> The entries in those lists are domain numbers from the highest level of
> grouping to the lowest (successive numbers are sub divisions)
> for example drawer#, socket#, chip#, core#... with the lowest level
> being the actual resource itself. So within a domain that last number
> is generally unique.
>
> Different resources can have different number of levels, for example if
> we have a grouping of node,socket,chip,core, a CPU core node would have
> a list with all 4 but a memory controller on a chip might have only the
> first 3.
Can you please give some examples for more clarity?
>
> This is an important statement in the spec:
>
> <<
> The user of this information is cautioned not to imply
> any specific physical/logical significance of the various intermediate
> levels.
>>>
>
> We can have multiple lists because a given resource can be connected
> via multiple path in the same platform.
>
> That means that to properly calculate the distance to another resource,
> all the path need to be looked at (assuming the HW will pick the
> shortest).
>
> Additionally, to help the OS, another property "ibm,associativity
> -reference-points" property indicates which levels (which indices in
> the above lists) are of biggest significance to the platform. This can
> typically be used by an OS to decide what to consider a "NUMA node"
> if the OS cannot operate on distances alone. This is a list of 1-based
> numbers representing indices in the associativity list. They should
> be in order of significance of the boundary.
Some examples, please.
>
> Finally, the ibm,max-associativity-domains (in the /rtas node on
> pseries) is an array of cells < C, M1, M2, ... MC > (first is
> count) containing for each domain/level the max number supported
> by the platform.
Max number of what? CPUs?
How does this help?
Please give some examples to help understand this.
>
> Ben.
>
>> On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
>> <gpkulkarni@gmail.com> wrote:
>> > Hi Mark,
>> >
>> > I have tried to answer your comments, in the meantime we are
>> > waiting for Ben
>> > to share the details.
>> >
>> > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <mark.rutland@arm.com
>> > > wrote:
>> > >
>> > > Hi,
>> > >
>> > > On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni
>> > > wrote:
>> > > > DT bindings for numa map for memory, cores and IOs using
>> > > > arm,associativity device node property.
>> > >
>> > > Given this is just a copy of ibm,associativity, I'm not sure I
>> > > see much
>> > > point in renaming the properties.
>> > >
>> > > However, (somewhat counter to that) I'm also concerned that this
>> > > isn't
>> > > sufficient for systems we're beginning to see today (more on that
>> > > below), so I don't think a simple copy of ibm,associativity is
>> > > good
>> > > enough.
>> >
>> > it is just copy right now, however it can evolve when we come
>> > across more
>> > arm64 numa platforms
>> > >
>> > >
>> > > >
>> > > > Signed-off-by: Ganapatrao Kulkarni <
>> > > > gkulkarni at caviumnetworks.com>
>> > > > ---
>> > > >  Documentation/devicetree/bindings/arm/numa.txt | 212
>> > > > +++++++++++++++++++++++++
>> > > >  1 file changed, 212 insertions(+)
>> > > >  create mode 100644
>> > > > Documentation/devicetree/bindings/arm/numa.txt
>> > > >
>> > > > diff --git a/Documentation/devicetree/bindings/arm/numa.txt
>> > > > b/Documentation/devicetree/bindings/arm/numa.txt
>> > > > new file mode 100644
>> > > > index 0000000..dc3ef86
>> > > > --- /dev/null
>> > > > +++ b/Documentation/devicetree/bindings/arm/numa.txt
>> > > > @@ -0,0 +1,212 @@
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +NUMA binding description.
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +1 - Introduction
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +
>> > > > +Systems employing a Non Uniform Memory Access (NUMA)
>> > > > architecture
>> > > > contain
>> > > > +collections of hardware resources including processors,
>> > > > memory, and I/O
>> > > > buses,
>> > > > +that comprise what is commonly known as a NUMA node.
>> > > > +Processor accesses to memory within the local NUMA node is
>> > > > generally
>> > > > faster
>> > > > +than processor accesses to memory outside of the local NUMA
>> > > > node.
>> > > > +DT defines interfaces that allow the platform to convey NUMA
>> > > > node
>> > > > +topology information to OS.
>> > > > +
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +2 - arm,associativity
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +The mapping is done using arm,associativity device property.
>> > > > +this property needs to be present in every device node which
>> > > > needs to
>> > > > to be
>> > > > +mapped to numa nodes.
>> > >
>> > > Can't there be some inheritance? e.g. all devices on a bus with
>> > > an
>> > > arm,associativity property being assumed to share that value?
>> >
>> > yes there is inheritance and respective bus drivers should take
>> > care of it,
>> > like pci driver does at present.
>> > >
>> > >
>> > > > +
>> > > > +arm,associativity property is set of 32-bit integers which
>> > > > defines
>> > > > level of
>> > >
>> > > s/set/list/ -- the order is important.
>> >
>> > ok
>> > >
>> > >
>> > > > +topology and boundary in the system at which a significant
>> > > > difference
>> > > > in
>> > > > +performance can be measured between cross-device accesses
>> > > > within
>> > > > +a single location and those spanning multiple locations.
>> > > > +The first cell always contains the broadest subdivision within
>> > > > the
>> > > > system,
>> > > > +while the last cell enumerates the individual devices, such as
>> > > > an SMT
>> > > > thread
>> > > > +of a CPU, or a bus bridge within an SoC".
>> > >
>> > > While this gives us some hierarchy, this doesn't seem to encode
>> > > relative
>> > > distances at all. That seems like an oversight.
>> >
>> >
>> > distance is computed, will add the details to document.
>> > local nodes will have distance as 10(LOCAL_DISTANCE) and every
>> > level, the
>> > distance multiplies by 2.
>> > for example, for level 1 numa topology, distance from local node to
>> > remote
>> > node will be 20.
>> >
>> > >
>> > >
>> > > Additionally, I'm somewhat unclear on how what you'd be expected
>> > > to
>> > > provide for this property in cases like ring or mesh
>> > > interconnects,
>> > > where there isn't a strict hierarchy (see systems with ARM's own
>> > > CCN, or
>> > > Tilera's TILE-Mx), but there is some measure of closeness.
>> >
>> >
>> > IIUC, as per ARMs CCN architecture, all core/clusters are at equal
>> > distance
>> > of DDR, i dont see any NUMA topology.
>> > however, if there are 2 SoC connected thorough the CCN, then it is
>> > very much
>> > similar to cavium topology.
>> >
>> > > Must all of these have the same length? If so, why not have a
>> > > #(whatever)-cells property in the root to describe the expected
>> > > length?
>> > > If not, how are they to be interpreted relative to each other?
>> >
>> >
>> > yes, all are of default size.
>> > IMHO, there is no need to add cells property.
>> > >
>> > >
>> > > > +
>> > > > +ex:
>> > >
>> > > s/ex/Example:/, please. There's no need to contract that.
>> > >
>> > > > +       /* board 0, socket 0, cluster 0, core 0  thread 0 */
>> > > > +       arm,associativity = <0 0 0 0 0>;
>> > > > +
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +3 - arm,associativity-reference-points
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +This property is a set of 32-bit integers, each representing
>> > > > an index
>> > > > into
>> > >
>> > > Likeise, s/set/list/
>> >
>> > ok
>> > >
>> > >
>> > > > +the arm,associativity nodes. The first integer is the most
>> > > > significant
>> > > > +NUMA boundary and the following are progressively less
>> > > > significant
>> > > > boundaries.
>> > > > +There can be more than one level of NUMA.
>> > >
>> > > I'm not clear on why this is necessary; the arm,associativity
>> > > property
>> > > is already ordered from most significant to least significant per
>> > > its
>> > > description.
>> >
>> >
>> > first entry in arm,associativity-reference-points is used to find
>> > which
>> > entry in associativity defines node id.
>> > also entries in arm,associativity-reference-points defines,
>> > how many entries(depth) in associativity can be used to calculate
>> > node
>> > distance
>> > in both level 1 and  multi level(hierarchical) numa topology.
>> >
>> > >
>> > >
>> > > What does this property achieve?
>> > >
>> > > The description also doesn't describe where this property is
>> > > expected to
>> > > live. The example isn't sufficient to disambiguate that,
>> > > especially as
>> > > it seems like a trivial case.
>> >
>> > sure, will add one more example to describe the
>> > arm,associativity-reference-points
>> > >
>> > >
>> > > Is this only expected at the root of the tree? Can it be
>> > > re-defined in sub-nodes?
>> >
>> > yes it is defined only at the root.
>> > >
>> > >
>> > > > +
>> > > > +Ex:
>> > >
>> > > s/Ex/Example:/, please
>> >
>> > sure.
>> > >
>> > >
>> > > > +       arm,associativity-reference-points = <0 1>;
>> > > > +       The board Id(index 0) used first to calculate the
>> > > > associativity
>> > > > (node
>> > > > +       distance), then follows the  socket id(index 1).
>> > > > +
>> > > > +       arm,associativity-reference-points = <1 0>;
>> > > > +       The socket Id(index 1) used first to calculate the
>> > > > associativity,
>> > > > +       then follows the board id(index 0).
>> > > > +
>> > > > +       arm,associativity-reference-points = <0>;
>> > > > +       Only the board Id(index 0) used to calculate the
>> > > > associativity.
>> > > > +
>> > > > +       arm,associativity-reference-points = <1>;
>> > > > +       Only socket Id(index 1) used to calculate the
>> > > > associativity.
>> > > > +
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +4 - Example dts
>> > > >
>> > > > +==============================================================
>> > > > ================
>> > > > +
>> > > > +Example: 2 Node system consists of 2 boards and each board
>> > > > having one
>> > > > socket
>> > > > +and 8 core in each socket.
>> > > > +
>> > > > +       arm,associativity-reference-points = <0>;
>> > > > +
>> > > > +       memory at 00c00000 {
>> > > > +               device_type = "memory";
>> > > > +               reg = <0x0 0x00c00000 0x0 0x80000000>;
>> > > > +               /* board 0, socket 0, no specific core */
>> > > > +               arm,associativity = <0 0 0xffff>;
>> > > > +       };
>> > > > +
>> > > > +       memory at 10000000000 {
>> > > > +               device_type = "memory";
>> > > > +               reg = <0x100 0x00000000 0x0 0x80000000>;
>> > > > +               /* board 1, socket 0, no specific core */
>> > > > +               arm,associativity = <1 0 0xffff>;
>> > > > +       };
>> > > > +
>> > > > +       cpus {
>> > > > +               #address-cells = <2>;
>> > > > +               #size-cells = <0>;
>> > > > +
>> > > > +               cpu at 000 {
>> > > > +                       device_type = "cpu";
>> > > > +                       compatible =  "arm,armv8";
>> > > > +                       reg = <0x0 0x000>;
>> > > > +                       enable-method = "psci";
>> > > > +                       /* board 0, socket 0, core 0*/
>> > > > +                       arm,associativity = <0 0 0>;
>> > >
>> > > We should specify w.r.t. memory and CPUs how the property is
>> > > expected to
>> > > be used (e.g. in the CPU nodes rather than the cpu-map, with
>> > > separate
>> > > memory nodes, etc). The generic description of arm,associativity
>> > > isn't
>> > > sufficient to limit confusion there.
>> >
>> > ok, will add the details like which nodes can use this property.
>> >
>> > >
>> > >
>> > > Thanks,
>> > > Mark.
>> >
>> >
>> > thanks
>> > Ganapat
thanks
Ganapat

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-09-29  8:38             ` Ganapatrao Kulkarni
@ 2015-09-30 10:53                 ` Mark Rutland
  -1 siblings, 0 replies; 94+ messages in thread
From: Mark Rutland @ 2015-09-30 10:53 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: Benjamin Herrenschmidt,
	Prasun.Kapoor-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8, Rob Herring,
	Leizhen (ThunderTown),
	Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A

On Tue, Sep 29, 2015 at 09:38:04AM +0100, Ganapatrao Kulkarni wrote:
> (sending again, by mistake it was set to html mode)
> 
> On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
> <gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > Hi Mark,
> >
> > I have tried to answer your comments, in the meantime we are waiting for Ben
> > to share the details.
> >
> > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
> >>
> >> Hi,
> >>
> >> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
> >> > DT bindings for numa map for memory, cores and IOs using
> >> > arm,associativity device node property.
> >>
> >> Given this is just a copy of ibm,associativity, I'm not sure I see much
> >> point in renaming the properties.
> >>
> >> However, (somewhat counter to that) I'm also concerned that this isn't
> >> sufficient for systems we're beginning to see today (more on that
> >> below), so I don't think a simple copy of ibm,associativity is good
> >> enough.
> >
> > it is just copy right now, however it can evolve when we come across more
> > arm64 numa platforms

Whatever we do, I suspect we'll have to evolve it as new platforms
appear. As I mentioned, there are contemporary NUMA ARM64 platforms
(e.g. those with CCN) that I don't think we can ignore now, given we'll
have to cater for them.

> >> > +==============================================================================
> >> > +2 - arm,associativity
> >> >
> >> > +==============================================================================
> >> > +The mapping is done using arm,associativity device property.
> >> > +this property needs to be present in every device node which needs to
> >> > to be
> >> > +mapped to numa nodes.
> >>
> >> Can't there be some inheritance? e.g. all devices on a bus with an
> >> arm,associativity property being assumed to share that value?
> >
> > yes there is inheritance and respective bus drivers should take care of it,
> > like pci driver does at present.

Ok. 

That seems counter to my initial interpretation of the wording that the
property must be present on device nodes that need to be mapped to NUMA
nodes.

Is there any simple way of describing the set of nodes that need this
property?

> >> > +topology and boundary in the system at which a significant difference
> >> > in
> >> > +performance can be measured between cross-device accesses within
> >> > +a single location and those spanning multiple locations.
> >> > +The first cell always contains the broadest subdivision within the
> >> > system,
> >> > +while the last cell enumerates the individual devices, such as an SMT
> >> > thread
> >> > +of a CPU, or a bus bridge within an SoC".
> >>
> >> While this gives us some hierarchy, this doesn't seem to encode relative
> >> distances at all. That seems like an oversight.
> >
> >
> > distance is computed, will add the details to document.
> > local nodes will have distance as 10(LOCAL_DISTANCE) and every level, the
> > distance multiplies by 2.
> > for example, for level 1 numa topology, distance from local node to remote
> > node will be 20.

This seems arbitrary.

Why not always have this explicitly described?
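
For reference, the implicit rule in the quoted reply can be sketched in
C roughly as follows. This is only an illustration under assumptions:
assoc_distance() and its parameters are hypothetical names, not taken
from the patch; reference points are treated as 0-based indices (as in
the binding's own examples); and domain IDs are assumed to be unique
system-wide at each level.

```c
#include <stddef.h>
#include <stdint.h>

#define LOCAL_DISTANCE 10

/*
 * Hypothetical helper (not from the patch): derive the distance between
 * two nodes from their associativity lists 'a' and 'b'.
 *
 * 'ref_points' holds 'depth' indices into those lists, most significant
 * NUMA boundary first. Starting from LOCAL_DISTANCE, the distance is
 * doubled for every reference level at which the two lists differ, and
 * the walk stops at the first level where they match.
 */
int assoc_distance(const uint32_t *a, const uint32_t *b,
		   const uint32_t *ref_points, size_t depth)
{
	int distance = LOCAL_DISTANCE;
	size_t i;

	for (i = 0; i < depth; i++) {
		uint32_t idx = ref_points[i];

		if (a[idx] == b[idx])
			break;		/* same domain at this level */

		distance *= 2;		/* one more boundary crossed */
	}

	return distance;
}
```

With arm,associativity-reference-points = <0> and the two memory nodes
from the example dts (<0 0 0xffff> and <1 0 0xffff>), this gives 10 for
a node to itself and 20 between the two boards. The factor of 2 is fixed
by the rule rather than by anything the platform reports.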

> >> Additionally, I'm somewhat unclear on how what you'd be expected to
> >> provide for this property in cases like ring or mesh interconnects,
> >> where there isn't a strict hierarchy (see systems with ARM's own CCN, or
> >> Tilera's TILE-Mx), but there is some measure of closeness.
> >
> >
> > IIUC, as per ARMs CCN architecture, all core/clusters are at equal distance
> > of DDR, i dont see any NUMA topology.

The CCN is a ring interconnect, so CPU clusters (henceforth CPUs) can be
connected with differing distances to RAM instances (or devices).

Consider the simplified network below:

  +-------+      +--------+      +-------+
  | CPU 0 |------| DRAM A |------| CPU 1 |
  +-------+      +--------+      +-------+
      |                              |
      |                              |
  +--------+                     +--------+
  | DRAM B |                     | DRAM C |
  +--------+                     +--------+
      |                              |
      |                              |
  +-------+      +--------+      +-------+
  | CPU 2 |------| DRAM D |------| CPU 3 |
  +-------+      +--------+      +-------+

In this case CPUs and DRAMs are spaced evenly on the ring, but the
distance between an arbitrary CPU and DRAM is not uniform.

CPU 0 can access DRAM A or DRAM B with a single hop, but accesses to
DRAM C or DRAM D take three hops.

An access from CPU 0 to DRAM C could contend with accesses from CPU 1 to
DRAM D, as they share hops on the ring.
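
To make the hop counts concrete, here is a small C sketch of the ring
above. It assumes traffic always takes the shorter way round the ring;
the enum and function names are made up for illustration and do not
correspond to any CCN interface.

```c
/* Stops on the ring from the diagram, listed clockwise from CPU 0. */
enum ring_stop {
	CPU0, DRAM_A, CPU1, DRAM_C, CPU3, DRAM_D, CPU2, DRAM_B,
	RING_STOPS	/* number of stops on the ring */
};

/* Shortest hop count between two stops, going either way round. */
int ring_hops(enum ring_stop from, enum ring_stop to)
{
	int cw = ((int)to - (int)from + RING_STOPS) % RING_STOPS;
	int ccw = RING_STOPS - cw;

	return cw < ccw ? cw : ccw;
}
```

From CPU 0 this gives 1 hop to DRAM A and DRAM B and 3 hops to DRAM C
and DRAM D. From CPU 1 it gives 1 hop to DRAM A and DRAM C and 3 hops
to DRAM B and DRAM D. DRAM A is therefore equally close to two CPUs, so
it cannot be placed in one CPU's group without misdescribing the other.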

There is definitely a NUMA topology here, but there's not a strict
hierarchy. I don't see how you would represent this with the proposed
binding.

Likewise for mesh networks (e.g. that of TILE-Mx).

> > however, if there are 2 SoC connected thorough the CCN, then it is very much
> > similar to cavium topology.
> >
> >> Must all of these have the same length? If so, why not have a
> >> #(whatever)-cells property in the root to describe the expected length?
> >> If not, how are they to be interpreted relative to each other?
> >
> >
> > yes, all are of default size.

Where that size is...?

> > IMHO, there is no need to add cells property.

That might be the case, but it's unclear from the documentation. I don't
see how one would parse / verify values currently.
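
As a sketch of the kind of check I have in mind, assuming the expected
length were stated somewhere (passed in below as the hypothetical
expected_cells), a parser could verify the lists like this. The function
names are illustrative only, and reference points are assumed to be
0-based indices as in the binding's examples.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: every associativity list found in the tree must
 * have exactly 'expected_cells' entries. 'list_lens' holds the length
 * (in cells) of each list that was found.
 */
bool assoc_lengths_valid(const size_t *list_lens, size_t nr_lists,
			 size_t expected_cells)
{
	size_t i;

	for (i = 0; i < nr_lists; i++) {
		if (list_lens[i] != expected_cells)
			return false;
	}

	return true;
}

/*
 * Hypothetical check: every reference point must index a cell that
 * exists in a list of 'expected_cells' entries.
 */
bool ref_points_valid(const uint32_t *ref_points, size_t depth,
		      size_t expected_cells)
{
	size_t i;

	for (i = 0; i < depth; i++) {
		if (ref_points[i] >= expected_cells)
			return false;
	}

	return true;
}
```

Without a stated length, the first check has nothing to compare against,
and the second cannot tell a bad index from a short list.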

> >> > +the arm,associativity nodes. The first integer is the most significant
> >> > +NUMA boundary and the following are progressively less significant
> >> > boundaries.
> >> > +There can be more than one level of NUMA.
> >>
> >> I'm not clear on why this is necessary; the arm,associativity property
> >> is already ordered from most significant to least significant per its
> >> description.
> >
> >
> > first entry in arm,associativity-reference-points is used to find which
> > entry in associativity defines node id.
> > also entries in arm,associativity-reference-points defines,
> > how many entries(depth) in associativity can be used to calculate node
> > distance
> > in both level 1 and  multi level(hierarchical) numa topology.

I think this needs a more thorough description; I don't follow the
current one.
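
My best reading of the first half of that explanation, sketched in C so
it can be corrected if wrong: the first reference point selects which
cell of a device's associativity list is its node ID. The names
(assoc_to_node_id, NO_NODE) are hypothetical, and indices are assumed to
be 0-based as in the binding's examples.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_NODE (-1)

/*
 * Hypothetical helper: pick the node ID out of an associativity list
 * using the first (most significant) reference point as the index.
 * Returns NO_NODE if there are no reference points or the index falls
 * outside the list.
 */
int assoc_to_node_id(const uint32_t *assoc, size_t nr_cells,
		     const uint32_t *ref_points, size_t depth)
{
	if (depth == 0 || ref_points[0] >= nr_cells)
		return NO_NODE;

	return (int)assoc[ref_points[0]];
}
```

For the example dts, reference points <0> would map the memory node with
arm,associativity = <1 0 0xffff> to node 1.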

> >> Is this only expected at the root of the tree? Can it be re-defined in
> >> sub-nodes?
> >
> > yes it is defined only at the root.

This needs to be stated explicitly.

I see that, this being the case, *,associativity-reference-points would
be a more powerful property than the #(whatever)-cells property I
mentioned earlier, but a more thorough description is required.

Thanks,
Mark.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-09-30 10:53                 ` Mark Rutland
@ 2015-09-30 17:50                   ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-09-30 17:50 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Benjamin Herrenschmidt,
	Prasun.Kapoor-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8, Rob Herring,
	Leizhen (ThunderTown),
	Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A

Hi Ben,

On Wed, Sep 30, 2015 at 4:23 PM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
> On Tue, Sep 29, 2015 at 09:38:04AM +0100, Ganapatrao Kulkarni wrote:
>> (sending again, by mistake it was set to html mode)
>>
>> On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
>> <gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> > Hi Mark,
>> >
>> > I have tried to answer your comments, in the meantime we are waiting for Ben
>> > to share the details.
>> >
>> > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
>> >> > DT bindings for numa map for memory, cores and IOs using
>> >> > arm,associativity device node property.
>> >>
>> >> Given this is just a copy of ibm,associativity, I'm not sure I see much
>> >> point in renaming the properties.
>> >>
>> >> However, (somewhat counter to that) I'm also concerned that this isn't
>> >> sufficient for systems we're beginning to see today (more on that
>> >> below), so I don't think a simple copy of ibm,associativity is good
>> >> enough.
>> >
>> > it is just copy right now, however it can evolve when we come across more
>> > arm64 numa platforms
>
> Whatever we do I suspect we'll have to evolve it as new platforms
> appear. As I mentioned there are contemporary NUMA ARM64 platforms (e.g.
> those with CCN) that I don't think we can ignore now given we'll have to
> cater for them.
>
>> >> > +==============================================================================
>> >> > +2 - arm,associativity
>> >> >
>> >> > +==============================================================================
>> >> > +The mapping is done using arm,associativity device property.
>> >> > +this property needs to be present in every device node which needs to
>> >> > to be
>> >> > +mapped to numa nodes.
>> >>
>> >> Can't there be some inheritance? e.g. all devices on a bus with an
>> >> arm,associativity property being assumed to share that value?
>> >
>> > yes there is inheritance and respective bus drivers should take care of it,
>> > like pci driver does at present.
>
> Ok.
>
> That seems counter to my initial interpretation of the wording that the
> property must be present on device nodes that need to be mapped to NUMA
> nodes.
>
> Is there any simple way of describing the set of nodes that need this
> property?
>
>> >> > +topology and boundary in the system at which a significant difference
>> >> > in
>> >> > +performance can be measured between cross-device accesses within
>> >> > +a single location and those spanning multiple locations.
>> >> > +The first cell always contains the broadest subdivision within the
>> >> > system,
>> >> > +while the last cell enumerates the individual devices, such as an SMT
>> >> > thread
>> >> > +of a CPU, or a bus bridge within an SoC".
>> >>
>> >> While this gives us some hierarchy, this doesn't seem to encode relative
>> >> distances at all. That seems like an oversight.
>> >
>> >
>> > distance is computed, will add the details to document.
>> > local nodes will have distance as 10(LOCAL_DISTANCE) and every level, the
>> > distance multiplies by 2.
>> > for example, for level 1 numa topology, distance from local node to remote
>> > node will be 20.
>
> This seems arbitrary.
>
> Why not always have this explicitly described?
>
>> >> Additionally, I'm somewhat unclear on how what you'd be expected to
>> >> provide for this property in cases like ring or mesh interconnects,
>> >> where there isn't a strict hierarchy (see systems with ARM's own CCN, or
>> >> Tilera's TILE-Mx), but there is some measure of closeness.
>> >
>> >
>> > IIUC, as per ARMs CCN architecture, all core/clusters are at equal distance
>> > of DDR, i dont see any NUMA topology.
>
> The CCN is a ring interconnect, so CPU clusters (henceforth CPUs) can be
> connected with differing distances to RAM instances (or devices).
>
> Consider the simplified network below:
>
>   +-------+      +--------+      +-------+
>   | CPU 0 |------| DRAM A |------| CPU 1 |
>   +-------+      +--------+      +-------+
>       |                              |
>       |                              |
>   +--------+                     +--------+
>   | DRAM B |                     | DRAM C |
>   +--------+                     +--------+
>       |                              |
>       |                              |
>   +-------+      +--------+      +-------+
>   | CPU 2 |------| DRAM D |------| CPU 3 |
>   +-------+      +--------+      +-------+
>
> In this case CPUs and DRAMs are spaced evenly on the ring, but the
> distance between an arbitrary CPU and DRAM is not uniform.
>
> CPU 0 can access DRAM A or DRAM B with a single hop, but accesses to
> DRAM C or DRAM D take three hops.
>
> An access from CPU 0 to DRAM C could contend with accesses from CPU 1 to
> DRAM D, as they share hops on the ring.
>
> There is definitely a NUMA topology here, but there's not a strict
> hierarchy. I don't see how you would represent this with the proposed
> binding.
Can you please explain how the associativity property would represent
this NUMA topology?
>
> Likewise for the mesh networks (e.g. that of TILE-Mx)
>
>> > however, if there are 2 SoC connected thorough the CCN, then it is very much
>> > similar to cavium topology.
>> >
>> >> Must all of these have the same length? If so, why not have a
>> >> #(whatever)-cells property in the root to describe the expected length?
>> >> If not, how are they to be interpreted relative to each other?
>> >
>> >
>> > yes, all are of default size.
>
> Where that size is...?
>
>> > IMHO, there is no need to add cells property.
>
> That might be the case, but it's unclear from the documentation. I don't
> see how one would parse / verify values currently.
>
>> >> > +the arm,associativity nodes. The first integer is the most significant
>> >> > +NUMA boundary and the following are progressively less significant
>> >> > boundaries.
>> >> > +There can be more than one level of NUMA.
>> >>
>> >> I'm not clear on why this is necessary; the arm,associativity property
>> >> is already ordered from most significant to least significant per its
>> >> description.
>> >
>> >
>> > first entry in arm,associativity-reference-points is used to find which
>> > entry in associativity defines node id.
>> > also entries in arm,associativity-reference-points defines,
>> > how many entries(depth) in associativity can be used to calculate node
>> > distance
>> > in both level 1 and  multi level(hierarchical) numa topology.
>
> I think this needs a more thorough description; I don't follow the
> current one.
>
>> >> Is this only expected at the root of the tree? Can it be re-defined in
>> >> sub-nodes?
>> >
>> > yes it is defined only at the root.
>
> This needs to be stated explicitly.
>
> I see that this being the case, *,associativity-reference-points would
> be a more powerful property than the #(whatever)-cells property I
> mentioned earlier, but a more thorough description is required.
>
> Thanks,
> Mark.
thanks
Ganapat
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-09-30 17:50                   ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-09-30 17:50 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Ben,

On Wed, Sep 30, 2015 at 4:23 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Tue, Sep 29, 2015 at 09:38:04AM +0100, Ganapatrao Kulkarni wrote:
>> (sending again, by mistake it was set to html mode)
>>
>> On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
>> <gpkulkarni@gmail.com> wrote:
>> > Hi Mark,
>> >
>> > I have tried to answer your comments, in the meantime we are waiting for Ben
>> > to share the details.
>> >
>> > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <mark.rutland@arm.com> wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni wrote:
>> >> > DT bindings for numa map for memory, cores and IOs using
>> >> > arm,associativity device node property.
>> >>
>> >> Given this is just a copy of ibm,associativity, I'm not sure I see much
>> >> point in renaming the properties.
>> >>
>> >> However, (somewhat counter to that) I'm also concerned that this isn't
>> >> sufficient for systems we're beginning to see today (more on that
>> >> below), so I don't think a simple copy of ibm,associativity is good
>> >> enough.
>> >
>> > it is just copy right now, however it can evolve when we come across more
>> > arm64 numa platforms
>
> Whatever we do I suspect we'll have to evolve it as new platforms
> appear. As I mentioned there are contemporary NUMA ARM64 platforms (e.g.
> those with CCN) that I don't think we can ignore now given we'll have to
> cater for them.
>
>> >> > +==============================================================================
>> >> > +2 - arm,associativity
>> >> >
>> >> > +==============================================================================
>> >> > +The mapping is done using arm,associativity device property.
>> >> > +this property needs to be present in every device node which needs to
>> >> > to be
>> >> > +mapped to numa nodes.
>> >>
>> >> Can't there be some inheritance? e.g. all devices on a bus with an
>> >> arm,associativity property being assumed to share that value?
>> >
>> > yes there is inheritance and respective bus drivers should take care of it,
>> > like pci driver does at present.
>
> Ok.
>
> That seems counter to my initial interpretation of the wording that the
> property must be present on device nodes that need to be mapped to NUMA
> nodes.
>
> Is there any simple way of describing the set of nodes that need this
> property?
>
>> >> > +topology and boundary in the system at which a significant difference
>> >> > in
>> >> > +performance can be measured between cross-device accesses within
>> >> > +a single location and those spanning multiple locations.
>> >> > +The first cell always contains the broadest subdivision within the
>> >> > system,
>> >> > +while the last cell enumerates the individual devices, such as an SMT
>> >> > thread
>> >> > +of a CPU, or a bus bridge within an SoC".
>> >>
>> >> While this gives us some hierarchy, this doesn't seem to encode relative
>> >> distances at all. That seems like an oversight.
>> >
>> >
>> > distance is computed, will add the details to document.
>> > local nodes will have distance as 10(LOCAL_DISTANCE) and every level, the
>> > distance multiplies by 2.
>> > for example, for level 1 numa topology, distance from local node to remote
>> > node will be 20.
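
For illustration only (not part of the posted patch), the doubling rule
quoted above can be sketched in C. The function name
numa_distance_for_levels() and the idea of passing in the number of NUMA
levels that separate two nodes are assumptions made for this sketch; only
the LOCAL_DISTANCE value of 10 and the "multiply by 2 per level" rule come
from the description above.

```c
#include <assert.h>

/* Distance of a node to itself, as stated in the thread. */
#define LOCAL_DISTANCE 10

/*
 * Hypothetical helper: distance between two nodes separated by
 * `levels` NUMA levels (0 = same node), doubling once per level.
 */
static unsigned int numa_distance_for_levels(unsigned int levels)
{
	/* Keep the shift small so the result cannot overflow. */
	assert(levels < 8);
	return (unsigned int)LOCAL_DISTANCE << levels;
}
```

With this rule a single-level topology yields 10 for local and 20 for
remote accesses, and a second level would yield 40.
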
>
> This seems arbitrary.
>
> Why not always have this explicitly described?
>
>> >> Additionally, I'm somewhat unclear on how what you'd be expected to
>> >> provide for this property in cases like ring or mesh interconnects,
>> >> where there isn't a strict hierarchy (see systems with ARM's own CCN, or
>> >> Tilera's TILE-Mx), but there is some measure of closeness.
>> >
>> >
>> > IIUC, as per ARMs CCN architecture, all core/clusters are at equal distance
>> > of DDR, i dont see any NUMA topology.
>
> The CCN is a ring interconnect, so CPU clusters (henceforth CPUs) can be
> connected with differing distances to RAM instances (or devices).
>
> Consider the simplified network below:
>
>   +-------+      +--------+      +-------+
>   | CPU 0 |------| DRAM A |------| CPU 1 |
>   +-------+      +--------+      +-------+
>       |                              |
>       |                              |
>   +--------+                     +--------+
>   | DRAM B |                     | DRAM C |
>   +--------+                     +--------+
>       |                              |
>       |                              |
>   +-------+      +--------+      +-------+
>   | CPU 2 |------| DRAM D |------| CPU 3 |
>   +-------+      +--------+      +-------+
>
> In this case CPUs and DRAMs are spaced evenly on the ring, but the
> distance between an arbitrary CPU and DRAM is not uniform.
>
> CPU 0 can access DRAM A or DRAM B with a single hop, but accesses to
> DRAM C or DRAM D take three hops.
>
> An access from CPU 0 to DRAM C could contend with accesses from CPU 1 to
> DRAM D, as they share hops on the ring.
>
> There is definitely a NUMA topology here, but there's not a strict
> hierarchy. I don't see how you would represent this with the proposed
> binding.
Can you please explain how the associativity property would represent this
NUMA topology?
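
For illustration only, the hop counts in the ring example quoted above can
be checked with a small C sketch. The position numbering (CPU 0, DRAM A,
CPU 1, DRAM C, CPU 3, DRAM D, CPU 2, DRAM B, going once around the ring)
is read off the diagram, and the names ring_hops() and RING_SIZE are
assumptions made for this sketch.

```c
#include <assert.h>

/* Ring positions in the order they appear around the diagram (assumed). */
enum {
	CPU0, DRAM_A, CPU1, DRAM_C, CPU3, DRAM_D, CPU2, DRAM_B,
	RING_SIZE
};

/* Shortest number of hops between two positions on the ring. */
static int ring_hops(int from, int to)
{
	int d;

	assert(from >= 0 && from < RING_SIZE);
	assert(to >= 0 && to < RING_SIZE);

	d = from > to ? from - to : to - from;
	/* Travelling the other way round the ring may be shorter. */
	return d < RING_SIZE - d ? d : RING_SIZE - d;
}
```

Under this numbering CPU 0 is one hop from DRAM A and DRAM B and three
hops from DRAM C and DRAM D, matching the description. CPU 1 is instead
one hop from DRAM A and DRAM C, so the "near" sets of CPU 0 and CPU 1
overlap without being equal, which is what a strictly hierarchical
description has trouble expressing.
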
>
> Likewise for the mesh networks (e.g. that of TILE-Mx)
>
>> > however, if there are 2 SoC connected thorough the CCN, then it is very much
>> > similar to cavium topology.
>> >
>> >> Must all of these have the same length? If so, why not have a
>> >> #(whatever)-cells property in the root to describe the expected length?
>> >> If not, how are they to be interpreted relative to each other?
>> >
>> >
>> > yes, all are of default size.
>
> Where that size is...?
>
>> > IMHO, there is no need to add cells property.
>
> That might be the case, but it's unclear from the documentation. I don't
> see how one would parse / verify values currently.
>
>> >> > +the arm,associativity nodes. The first integer is the most significant
>> >> > +NUMA boundary and the following are progressively less significant
>> >> > boundaries.
>> >> > +There can be more than one level of NUMA.
>> >>
>> >> I'm not clear on why this is necessary; the arm,associativity property
>> >> is already ordered from most significant to least significant per its
>> >> description.
>> >
>> >
>> > first entry in arm,associativity-reference-points is used to find which
>> > entry in associativity defines node id.
>> > also entries in arm,associativity-reference-points defines,
>> > how many entries(depth) in associativity can be used to calculate node
>> > distance
>> > in both level 1 and  multi level(hierarchical) numa topology.
>
> I think this needs a more thorough description; I don't follow the
> current one.
>
>> >> Is this only expected at the root of the tree? Can it be re-defined in
>> >> sub-nodes?
>> >
>> > yes it is defined only at the root.
>
> This needs to be stated explicitly.
>
> I see that this being the case, *,associativity-reference-points would
> be a more powerful property than the #(whatever)-cells property I
> mentioned earlier, but a more thorough description is required.
>
> Thanks,
> Mark.
thanks
Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-09-30 17:50                   ` Ganapatrao Kulkarni
@ 2015-10-01  1:05                       ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 94+ messages in thread
From: Benjamin Herrenschmidt @ 2015-10-01  1:05 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, Mark Rutland
  Cc: Prasun.Kapoor, Rob Herring, Leizhen (ThunderTown),
	Ganapatrao Kulkarni, linux-arm-kernel, devicetree, Will Deacon,
	Catalin Marinas, grant.likely, leif.lindholm, rfranz,
	ard.biesheuvel, msalter, robh+dt, steve.capper, hanjun.guo,
	al.stone, arnd

On Wed, 2015-09-30 at 23:20 +0530, Ganapatrao Kulkarni wrote:
> Hi Ben,

Before I dig in more (short on time right now), PAPR (at least a chunk
of it) was released publicly:

https://members.openpowerfoundation.org/document/dl/469

(You don't need to be a member or to sign up to get it.)

Cheers,
Ben.

> On Wed, Sep 30, 2015 at 4:23 PM, Mark Rutland <mark.rutland@arm.com>
> wrote:
> > On Tue, Sep 29, 2015 at 09:38:04AM +0100, Ganapatrao Kulkarni
> > wrote:
> > > (sending again, by mistake it was set to html mode)
> > > 
> > > On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
> > > <gpkulkarni@gmail.com> wrote:
> > > > Hi Mark,
> > > > 
> > > > I have tried to answer your comments, in the meantime we are
> > > > waiting for Ben
> > > > to share the details.
> > > > 
> > > > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <
> > > > mark.rutland@arm.com> wrote:
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni
> > > > > wrote:
> > > > > > DT bindings for numa map for memory, cores and IOs using
> > > > > > arm,associativity device node property.
> > > > > 
> > > > > Given this is just a copy of ibm,associativity, I'm not sure
> > > > > I see much
> > > > > point in renaming the properties.
> > > > > 
> > > > > However, (somewhat counter to that) I'm also concerned that
> > > > > this isn't
> > > > > sufficient for systems we're beginning to see today (more on
> > > > > that
> > > > > below), so I don't think a simple copy of ibm,associativity
> > > > > is good
> > > > > enough.
> > > > 
> > > > it is just copy right now, however it can evolve when we come
> > > > across more
> > > > arm64 numa platforms
> > 
> > Whatever we do I suspect we'll have to evolve it as new platforms
> > appear. As I mentioned there are contemporary NUMA ARM64 platforms
> > (e.g.
> > those with CCN) that I don't think we can ignore now given we'll
> > have to
> > cater for them.
> > 
> > > > > > +==========================================================
> > > > > > ====================
> > > > > > +2 - arm,associativity
> > > > > > 
> > > > > > +==========================================================
> > > > > > ====================
> > > > > > +The mapping is done using arm,associativity device
> > > > > > property.
> > > > > > +this property needs to be present in every device node
> > > > > > which needs to
> > > > > > to be
> > > > > > +mapped to numa nodes.
> > > > > 
> > > > > Can't there be some inheritance? e.g. all devices on a bus
> > > > > with an
> > > > > arm,associativity property being assumed to share that value?
> > > > 
> > > > yes there is inheritance and respective bus drivers should take
> > > > care of it,
> > > > like pci driver does at present.
> > 
> > Ok.
> > 
> > That seems counter to my initial interpretation of the wording that
> > the
> > property must be present on device nodes that need to be mapped to
> > NUMA
> > nodes.
> > 
> > Is there any simple way of describing the set of nodes that need
> > this
> > property?
> > 
> > > > > > +topology and boundary in the system at which a significant
> > > > > > difference
> > > > > > in
> > > > > > +performance can be measured between cross-device accesses
> > > > > > within
> > > > > > +a single location and those spanning multiple locations.
> > > > > > +The first cell always contains the broadest subdivision
> > > > > > within the
> > > > > > system,
> > > > > > +while the last cell enumerates the individual devices,
> > > > > > such as an SMT
> > > > > > thread
> > > > > > +of a CPU, or a bus bridge within an SoC".
> > > > > 
> > > > > While this gives us some hierarchy, this doesn't seem to
> > > > > encode relative
> > > > > distances at all. That seems like an oversight.
> > > > 
> > > > 
> > > > distance is computed, will add the details to document.
> > > > local nodes will have distance as 10(LOCAL_DISTANCE) and every
> > > > level, the
> > > > distance multiplies by 2.
> > > > for example, for level 1 numa topology, distance from local
> > > > node to remote
> > > > node will be 20.
> > 
> > This seems arbitrary.
> > 
> > Why not always have this explicitly described?
> > 
> > > > > Additionally, I'm somewhat unclear on how what you'd be
> > > > > expected to
> > > > > provide for this property in cases like ring or mesh
> > > > > interconnects,
> > > > > where there isn't a strict hierarchy (see systems with ARM's
> > > > > own CCN, or
> > > > > Tilera's TILE-Mx), but there is some measure of closeness.
> > > > 
> > > > 
> > > > IIUC, as per ARMs CCN architecture, all core/clusters are at
> > > > equal distance
> > > > of DDR, i dont see any NUMA topology.
> > 
> > The CCN is a ring interconnect, so CPU clusters (henceforth CPUs)
> > can be
> > connected with differing distances to RAM instances (or devices).
> > 
> > Consider the simplified network below:
> > 
> >   +-------+      +--------+      +-------+
> >   | CPU 0 |------| DRAM A |------| CPU 1 |
> >   +-------+      +--------+      +-------+
> >       |                              |
> >       |                              |
> >   +--------+                     +--------+
> >   | DRAM B |                     | DRAM C |
> >   +--------+                     +--------+
> >       |                              |
> >       |                              |
> >   +-------+      +--------+      +-------+
> >   | CPU 2 |------| DRAM D |------| CPU 3 |
> >   +-------+      +--------+      +-------+
> > 
> > In this case CPUs and DRAMs are spaced evenly on the ring, but the
> > distance between an arbitrary CPU and DRAM is not uniform.
> > 
> > CPU 0 can access DRAM A or DRAM B with a single hop, but accesses
> > to
> > DRAM C or DRAM D take three hops.
> > 
> > An access from CPU 0 to DRAM C could contend with accesses from CPU
> > 1 to
> > DRAM D, as they share hops on the ring.
> > 
> > There is definitely a NUMA topology here, but there's not a strict
> > hierarchy. I don't see how you would represent this with the
> > proposed
> > binding.
> can you please explain, how associativity property will represent
> this
> numa topology?
> > 
> > Likewise for the mesh networks (e.g. that of TILE-Mx)
> > 
> > > > however, if there are 2 SoC connected thorough the CCN, then it
> > > > is very much
> > > > similar to cavium topology.
> > > > 
> > > > > Must all of these have the same length? If so, why not have a
> > > > > #(whatever)-cells property in the root to describe the
> > > > > expected length?
> > > > > If not, how are they to be interpreted relative to each
> > > > > other?
> > > > 
> > > > 
> > > > yes, all are of default size.
> > 
> > Where that size is...?
> > 
> > > > IMHO, there is no need to add cells property.
> > 
> > That might be the case, but it's unclear from the documentation. I
> > don't
> > see how one would parse / verify values currently.
> > 
> > > > > > +the arm,associativity nodes. The first integer is the most
> > > > > > significant
> > > > > > +NUMA boundary and the following are progressively less
> > > > > > significant
> > > > > > boundaries.
> > > > > > +There can be more than one level of NUMA.
> > > > > 
> > > > > I'm not clear on why this is necessary; the arm,associativity
> > > > > property
> > > > > is already ordered from most significant to least significant
> > > > > per its
> > > > > description.
> > > > 
> > > > 
> > > > first entry in arm,associativity-reference-points is used to
> > > > find which
> > > > entry in associativity defines node id.
> > > > also entries in arm,associativity-reference-points defines,
> > > > how many entries(depth) in associativity can be used to
> > > > calculate node
> > > > distance
> > > > in both level 1 and  multi level(hierarchical) numa topology.
> > 
> > I think this needs a more thorough description; I don't follow the
> > current one.
> > 
> > > > > Is this only expected at the root of the tree? Can it be
> > > > > re-defined in sub-nodes?
> > > > 
> > > > yes it is defined only at the root.
> > 
> > This needs to be stated explicitly.
> > 
> > I see that this being the case, *,associativity-reference-points
> > would
> > be a more powerful property than the #(whatever)-cells property I
> > mentioned earlier, but a more thorough description is required.
> > 
> > Thanks,
> > Mark.
> thanks
> Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-10-01  1:05                       ` Benjamin Herrenschmidt
  (?)
@ 2015-10-01  5:11                       ` Ganapatrao Kulkarni
       [not found]                         ` <CAFpQJXXKcwks0iZN+3B=U0-9uYKFpAXcZE90GCHN9WyM45Hdpw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  -1 siblings, 1 reply; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-10-01  5:11 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Mark Rutland
  Cc: Prasun.Kapoor, Rob Herring, Leizhen (ThunderTown),
	Ganapatrao Kulkarni, linux-arm-kernel, devicetree, Will Deacon,
	Catalin Marinas, grant.likely, leif.lindholm, rfranz,
	ard.biesheuvel, msalter, robh+dt, steve.capper, hanjun.guo,
	al.stone, arnd, Pawel Moll, ijc+devicetree, galak, Marc Zyngier

Hi Ben,


On Thu, Oct 1, 2015 at 6:35 AM, Benjamin Herrenschmidt <
benh@kernel.crashing.org> wrote:

> On Wed, 2015-09-30 at 23:20 +0530, Ganapatrao Kulkarni wrote:
> > Hi Ben,
>
> Before I dig in more (short on time right now), PAPR (at least a chunk
> of it) was released publicly:
>
> https://members.openpowerfoundation.org/document/dl/469
>
Thanks a lot for sharing this document.
I went through chapter 15 of this doc, which walks through an example of a
hierarchical NUMA topology.
I still could not represent a ring/mesh NUMA topology, which will be present
in other upcoming arm64 platforms, using associativity.


> (You don't need to be a member nor to sign up to get it)
>
> Cheers,
> Ben.
>
> > On Wed, Sep 30, 2015 at 4:23 PM, Mark Rutland <mark.rutland@arm.com>
> > wrote:
> > > On Tue, Sep 29, 2015 at 09:38:04AM +0100, Ganapatrao Kulkarni
> > > wrote:
> > > > (sending again, by mistake it was set to html mode)
> > > >
> > > > On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
> > > > <gpkulkarni@gmail.com> wrote:
> > > > > Hi Mark,
> > > > >
> > > > > I have tried to answer your comments, in the meantime we are
> > > > > waiting for Ben
> > > > > to share the details.
> > > > >
> > > > > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <
> > > > > mark.rutland@arm.com> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni
> > > > > > wrote:
> > > > > > > DT bindings for numa map for memory, cores and IOs using
> > > > > > > arm,associativity device node property.
> > > > > >
> > > > > > Given this is just a copy of ibm,associativity, I'm not sure
> > > > > > I see much
> > > > > > point in renaming the properties.
> > > > > >
> > > > > > However, (somewhat counter to that) I'm also concerned that
> > > > > > this isn't
> > > > > > sufficient for systems we're beginning to see today (more on
> > > > > > that
> > > > > > below), so I don't think a simple copy of ibm,associativity
> > > > > > is good
> > > > > > enough.
> > > > >
> > > > > it is just copy right now, however it can evolve when we come
> > > > > across more
> > > > > arm64 numa platforms
> > >
> > > Whatever we do I suspect we'll have to evolve it as new platforms
> > > appear. As I mentioned there are contemporary NUMA ARM64 platforms
> > > (e.g.
> > > those with CCN) that I don't think we can ignore now given we'll
> > > have to
> > > cater for them.
> > >
> > > > > > > +==========================================================
> > > > > > > ====================
> > > > > > > +2 - arm,associativity
> > > > > > >
> > > > > > > +==========================================================
> > > > > > > ====================
> > > > > > > +The mapping is done using arm,associativity device
> > > > > > > property.
> > > > > > > +this property needs to be present in every device node
> > > > > > > which needs to
> > > > > > > to be
> > > > > > > +mapped to numa nodes.
> > > > > >
> > > > > > Can't there be some inheritance? e.g. all devices on a bus
> > > > > > with an
> > > > > > arm,associativity property being assumed to share that value?
> > > > >
> > > > > yes there is inheritance and respective bus drivers should take
> > > > > care of it,
> > > > > like pci driver does at present.
> > >
> > > Ok.
> > >
> > > That seems counter to my initial interpretation of the wording that
> > > the
> > > property must be present on device nodes that need to be mapped to
> > > NUMA
> > > nodes.
> > >
> > > Is there any simple way of describing the set of nodes that need
> > > this
> > > property?
> > >
> > > > > > > +topology and boundary in the system at which a significant
> > > > > > > difference
> > > > > > > in
> > > > > > > +performance can be measured between cross-device accesses
> > > > > > > within
> > > > > > > +a single location and those spanning multiple locations.
> > > > > > > +The first cell always contains the broadest subdivision
> > > > > > > within the
> > > > > > > system,
> > > > > > > +while the last cell enumerates the individual devices,
> > > > > > > such as an SMT
> > > > > > > thread
> > > > > > > +of a CPU, or a bus bridge within an SoC".
> > > > > >
> > > > > > While this gives us some hierarchy, this doesn't seem to
> > > > > > encode relative
> > > > > > distances at all. That seems like an oversight.
> > > > >
> > > > >
> > > > > distance is computed, will add the details to document.
> > > > > local nodes will have distance as 10(LOCAL_DISTANCE) and every
> > > > > level, the
> > > > > distance multiplies by 2.
> > > > > for example, for level 1 numa topology, distance from local
> > > > > node to remote
> > > > > node will be 20.
> > >
> > > This seems arbitrary.
> > >
> > > Why not always have this explicitly described?
> > >
> > > > > > Additionally, I'm somewhat unclear on how what you'd be
> > > > > > expected to
> > > > > > provide for this property in cases like ring or mesh
> > > > > > interconnects,
> > > > > > where there isn't a strict hierarchy (see systems with ARM's
> > > > > > own CCN, or
> > > > > > Tilera's TILE-Mx), but there is some measure of closeness.
> > > > >
> > > > >
> > > > > IIUC, as per ARMs CCN architecture, all core/clusters are at
> > > > > equal distance
> > > > > of DDR, i dont see any NUMA topology.
> > >
> > > The CCN is a ring interconnect, so CPU clusters (henceforth CPUs)
> > > can be
> > > connected with differing distances to RAM instances (or devices).
> > >
> > > Consider the simplified network below:
> > >
> > >   +-------+      +--------+      +-------+
> > >   | CPU 0 |------| DRAM A |------| CPU 1 |
> > >   +-------+      +--------+      +-------+
> > >       |                              |
> > >       |                              |
> > >   +--------+                     +--------+
> > >   | DRAM B |                     | DRAM C |
> > >   +--------+                     +--------+
> > >       |                              |
> > >       |                              |
> > >   +-------+      +--------+      +-------+
> > >   | CPU 2 |------| DRAM D |------| CPU 3 |
> > >   +-------+      +--------+      +-------+
> > >
> > > In this case CPUs and DRAMs are spaced evenly on the ring, but the
> > > distance between an arbitrary CPU and DRAM is not uniform.
> > >
> > > CPU 0 can access DRAM A or DRAM B with a single hop, but accesses
> > > to
> > > DRAM C or DRAM D take three hops.
> > >
> > > An access from CPU 0 to DRAM C could contend with accesses from CPU
> > > 1 to
> > > DRAM D, as they share hops on the ring.
> > >
> > > There is definitely a NUMA topology here, but there's not a strict
> > > hierarchy. I don't see how you would represent this with the
> > > proposed
> > > binding.
> > can you please explain, how associativity property will represent
> > this
> > numa topology?
>
Hi Mark,

I am thinking that if we cannot describe these topologies using
associativity (or if doing so becomes too complex), we should consider an
alternate binding that suits existing and upcoming arm64 platforms.
Can we consider the NUMA binding below, which is in line with ACPI and
should be able to describe all sorts of topologies?

I am proposing the following:

1. Introduce a "proximity" node property. This property will be present in
DT nodes such as memory, cpu, bus and devices (like the associativity
property) and will tell which NUMA node (proximity domain) the DT node
belongs to.

examples:
        cpu@000 {
                device_type = "cpu";
                compatible = "cavium,thunder", "arm,armv8";
                reg = <0x0 0x000>;
                enable-method = "psci";
                proximity = <0>;
        };

        cpu@001 {
                device_type = "cpu";
                compatible = "cavium,thunder", "arm,armv8";
                reg = <0x0 0x001>;
                enable-method = "psci";
                proximity = <1>;
        };

        memory@00000000 {
                device_type = "memory";
                reg = <0x0 0x01400000 0x3 0xFEC00000>;
                proximity = <0>;
        };

        memory@10000000000 {
                device_type = "memory";
                reg = <0x100 0x00400000 0x3 0xFFC00000>;
                proximity = <1>;
        };

        pcie0@0x8480,00000000 {
                compatible = "cavium,thunder-pcie";
                device_type = "pci";
                msi-parent = <&its>;
                bus-range = <0 255>;
                #size-cells = <2>;
                #address-cells = <3>;
                #stream-id-cells = <1>;
                reg = <0x8480 0x00000000 0 0x10000000>; /* Configuration space */
                ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>, /* mem ranges */
                         <0x03000000 0x8300 0x00000000 0x8300 0x00000000 0x500 0x00000000>;
                proximity = <0>;
        };
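
To illustrate how a consumer might map a DT node to its NUMA node with
this property, here is a minimal sketch in Python. It is only a model:
DtNode, numa_node_of and NO_NODE are hypothetical names, not kernel or
libfdt APIs, and the fallback to the parent node is an assumption borrowed
from the inheritance discussed earlier for associativity (e.g. a device
behind a PCI host bridge), not something the proposal above specifies.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

NO_NODE = -1  # assumed sentinel meaning "no NUMA node known"


@dataclass
class DtNode:
    """Hypothetical in-memory stand-in for a device tree node."""
    name: str
    props: Dict[str, int] = field(default_factory=dict)
    parent: Optional["DtNode"] = None


def numa_node_of(node: DtNode) -> int:
    """Return the nearest "proximity" value, walking up towards the root.

    The walk to the parent is an assumption (inheritance from the bus);
    a node with no "proximity" anywhere above it yields NO_NODE.
    """
    cur: Optional[DtNode] = node
    while cur is not None:
        if "proximity" in cur.props:
            return cur.props["proximity"]
        cur = cur.parent
    return NO_NODE


# Nodes mirroring the examples above; "ethernet@0" is a made-up child
# device that carries no "proximity" property of its own.
root = DtNode("/")
cpu0 = DtNode("cpu@000", {"proximity": 0}, root)
cpu1 = DtNode("cpu@001", {"proximity": 1}, root)
pcie0 = DtNode("pcie0@0x8480,00000000", {"proximity": 0}, root)
nic = DtNode("ethernet@0", {}, pcie0)
```

With this model, numa_node_of(cpu1) resolves directly from the node, while
numa_node_of(nic) falls back to the value on pcie0.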


2. Introduce a new DT node, "proximity-map", which will capture the NxN NUMA
node distance matrix.

For example, consider 4 nodes connected in a mesh/ring structure as:
A(0) <connected to> B(1) <connected to> C(2) <connected to> D(3) <connected
to> A(0)

The relative distances would be:
      A -> B = 20
      B -> C = 20
      C -> D = 20
      D -> A = 20
      A -> C = 40
      B -> D = 40

and the DT representation of this distance matrix is:

       proximity-map {
             node-count = <4>;
             distance-matrix = <0 0  10>,
                                <0 1  20>,
                                <0 2  40>,
                                <0 3  20>,
                                <1 0  20>,
                                <1 1  10>,
                                <1 2  20>,
                                <1 3  40>,
                                <2 0  40>,
                                <2 1  20>,
                                <2 2  10>,
                                <2 3  20>,
                                <3 0  20>,
                                <3 1  40>,
                                <3 2  20>,
                                <3 3  10>;
       };

Entries like <0 0>, <1 1>, <2 2> and <3 3> can be optional; the code can
fill in the default value (local distance).
Entries like <1 0> can be optional if <0 1> and <1 0> have the same
distance.
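
As a rough sketch of the two defaulting rules above (not an
implementation of any existing parser), the following Python expands a
sparse list of <from to distance> triplets into the full NxN matrix.
build_distance_matrix is a hypothetical helper name; LOCAL_DISTANCE = 10
is the local distance used in this thread. Rejecting a pair that is
missing in both directions is an assumption, since the proposal does not
say what should happen in that case.

```python
from typing import Iterable, List, Optional, Tuple

LOCAL_DISTANCE = 10  # distance of a node to itself, as used in this thread


def build_distance_matrix(
    node_count: int, entries: Iterable[Tuple[int, int, int]]
) -> List[List[int]]:
    """Expand sparse (from, to, distance) triplets into an NxN matrix."""
    dist: List[List[Optional[int]]] = [
        [None] * node_count for _ in range(node_count)
    ]

    for a, b, d in entries:
        if not (0 <= a < node_count and 0 <= b < node_count):
            raise ValueError(f"node id out of range in <{a} {b} {d}>")
        dist[a][b] = d

    # Rule 1: <i i> entries are optional and default to the local distance.
    for i in range(node_count):
        if dist[i][i] is None:
            dist[i][i] = LOCAL_DISTANCE

    # Rule 2: <j i> is optional when <i j> is given with the same distance.
    for i in range(node_count):
        for j in range(node_count):
            if dist[i][j] is None:
                if dist[j][i] is None:
                    # Assumed policy: a pair missing in both directions
                    # is treated as an error.
                    raise ValueError(f"no distance given for {i} <-> {j}")
                dist[i][j] = dist[j][i]

    return dist  # every cell is now an int


# The 4-node ring from the example, with only the upper triangle given.
sparse = [(0, 1, 20), (0, 2, 40), (0, 3, 20),
          (1, 2, 20), (1, 3, 40),
          (2, 3, 20)]
matrix = build_distance_matrix(4, sparse)
```

For the ring example, six triplets are enough to recover all sixteen
entries of the distance-matrix shown above.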


> >
> > > Likewise for the mesh networks (e.g. that of TILE-Mx)
> > >
> > > > > however, if there are 2 SoC connected thorough the CCN, then it
> > > > > is very much
> > > > > similar to cavium topology.
> > > > >
> > > > > > Must all of these have the same length? If so, why not have a
> > > > > > #(whatever)-cells property in the root to describe the
> > > > > > expected length?
> > > > > > If not, how are they to be interpreted relative to each
> > > > > > other?
> > > > >
> > > > >
> > > > > yes, all are of default size.
> > >
> > > Where that size is...?
> > >
> > > > > IMHO, there is no need to add cells property.
> > >
> > > That might be the case, but it's unclear from the documentation. I
> > > don't
> > > see how one would parse / verify values currently.
> > >
> > > > > > > +the arm,associativity nodes. The first integer is the most
> > > > > > > significant
> > > > > > > +NUMA boundary and the following are progressively less
> > > > > > > significant
> > > > > > > boundaries.
> > > > > > > +There can be more than one level of NUMA.
> > > > > >
> > > > > > I'm not clear on why this is necessary; the arm,associativity
> > > > > > property
> > > > > > is already ordered from most significant to least significant
> > > > > > per its
> > > > > > description.
> > > > >
> > > > >
> > > > > first entry in arm,associativity-reference-points is used to
> > > > > find which
> > > > > entry in associativity defines node id.
> > > > > also entries in arm,associativity-reference-points defines,
> > > > > how many entries(depth) in associativity can be used to
> > > > > calculate node
> > > > > distance
> > > > > in both level 1 and  multi level(hierarchical) numa topology.
> > >
> > > I think this needs a more thorough description; I don't follow the
> > > current one.
> > >
> > > > > > Is this only expected at the root of the tree? Can it be re
> > > > > > -defined in
> > > > > > sub-nodes?
> > > > >
> > > > > yes it is defined only at the root.
> > >
> > > This needs to be stated explicitly.
> > >
> > > I see that this being the case, *,associativity-reference-points
> > > would
> > > be a more powerful property than the #(whatever)-cells property I
> > > mentioned earlier, but a more thorough description is required.
> > >
> > > Thanks,
> > > Mark.
> > thanks
> > Ganapat
>

thanks
Ganapat


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-10-01  5:11                       ` Ganapatrao Kulkarni
@ 2015-10-01  5:25                             ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-10-01  5:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Mark Rutland
  Cc: Prasun.Kapoor-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8, Rob Herring,
	Leizhen (ThunderTown),
	Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd

(Sending again; I don't know why plain text mode was unchecked.
Apologies for the inconvenience.)

On Thu, Oct 1, 2015 at 10:41 AM, Ganapatrao Kulkarni
<gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Hi Ben,
>
>
> On Thu, Oct 1, 2015 at 6:35 AM, Benjamin Herrenschmidt
> <benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org> wrote:
>>
>> On Wed, 2015-09-30 at 23:20 +0530, Ganapatrao Kulkarni wrote:
>> > Hi Ben,
>>
>> Before I dig in more (short on time right now), PAPR (at least a chunk
>> of it) was released publicly:
>>
>> https://members.openpowerfoundation.org/document/dl/469
>
> thanks a lot for sharing this document.
> i went through the chapter 15 of this doc which explains an example on
> hierarchical numa topology.
> i still could not represent the ring/mesh numa topology using associativity,
> which will be present in other upcoming arm64 platforms.
>
>>
>> (You don't need to be a member nor to sign up to get it)
>>
>> Cheers,
>> Ben.
>>
>> > On Wed, Sep 30, 2015 at 4:23 PM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org>
>> > wrote:
>> > > On Tue, Sep 29, 2015 at 09:38:04AM +0100, Ganapatrao Kulkarni
>> > > wrote:
>> > > > (sending again, by mistake it was set to html mode)
>> > > >
>> > > > On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
>> > > > <gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> > > > > Hi Mark,
>> > > > >
>> > > > > I have tried to answer your comments, in the meantime we are
>> > > > > waiting for Ben
>> > > > > to share the details.
>> > > > >
>> > > > > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <
>> > > > > mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
>> > > > > >
>> > > > > > Hi,
>> > > > > >
>> > > > > > On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni
>> > > > > > wrote:
>> > > > > > > DT bindings for numa map for memory, cores and IOs using
>> > > > > > > arm,associativity device node property.
>> > > > > >
>> > > > > > Given this is just a copy of ibm,associativity, I'm not sure
>> > > > > > I see much
>> > > > > > point in renaming the properties.
>> > > > > >
>> > > > > > However, (somewhat counter to that) I'm also concerned that
>> > > > > > this isn't
>> > > > > > sufficient for systems we're beginning to see today (more on
>> > > > > > that
>> > > > > > below), so I don't think a simple copy of ibm,associativity
>> > > > > > is good
>> > > > > > enough.
>> > > > >
>> > > > > it is just copy right now, however it can evolve when we come
>> > > > > across more
>> > > > > arm64 numa platforms
>> > >
>> > > Whatever we do I suspect we'll have to evolve it as new platforms
>> > > appear. As I mentioned there are contemporary NUMA ARM64 platforms
>> > > (e.g.
>> > > those with CCN) that I don't think we can ignore now given we'll
>> > > have to
>> > > cater for them.
>> > >
>> > > > > > > +==========================================================
>> > > > > > > ====================
>> > > > > > > +2 - arm,associativity
>> > > > > > >
>> > > > > > > +==========================================================
>> > > > > > > ====================
>> > > > > > > +The mapping is done using arm,associativity device
>> > > > > > > property.
>> > > > > > > +this property needs to be present in every device node
>> > > > > > > which needs to
>> > > > > > > to be
>> > > > > > > +mapped to numa nodes.
>> > > > > >
>> > > > > > Can't there be some inheritance? e.g. all devices on a bus
>> > > > > > with an
>> > > > > > arm,associativity property being assumed to share that value?
>> > > > >
>> > > > > yes there is inheritance and respective bus drivers should take
>> > > > > care of it,
>> > > > > like pci driver does at present.
>> > >
>> > > Ok.
>> > >
>> > > That seems counter to my initial interpretation of the wording that
>> > > the
>> > > property must be present on device nodes that need to be mapped to
>> > > NUMA
>> > > nodes.
>> > >
>> > > Is there any simple way of describing the set of nodes that need
>> > > this
>> > > property?
>> > >
>> > > > > > > +topology and boundary in the system at which a significant
>> > > > > > > difference
>> > > > > > > in
>> > > > > > > +performance can be measured between cross-device accesses
>> > > > > > > within
>> > > > > > > +a single location and those spanning multiple locations.
>> > > > > > > +The first cell always contains the broadest subdivision
>> > > > > > > within the
>> > > > > > > system,
>> > > > > > > +while the last cell enumerates the individual devices,
>> > > > > > > such as an SMT
>> > > > > > > thread
>> > > > > > > +of a CPU, or a bus bridge within an SoC".
>> > > > > >
>> > > > > > While this gives us some hierarchy, this doesn't seem to
>> > > > > > encode relative
>> > > > > > distances at all. That seems like an oversight.
>> > > > >
>> > > > >
>> > > > > distance is computed, will add the details to document.
>> > > > > local nodes will have distance as 10(LOCAL_DISTANCE) and every
>> > > > > level, the
>> > > > > distance multiplies by 2.
>> > > > > for example, for level 1 numa topology, distance from local
>> > > > > node to remote
>> > > > > node will be 20.
>> > >
>> > > This seems arbitrary.
>> > >
>> > > Why not always have this explicitly described?
>> > >
>> > > > > > Additionally, I'm somewhat unclear on how what you'd be
>> > > > > > expected to
>> > > > > > provide for this property in cases like ring or mesh
>> > > > > > interconnects,
>> > > > > > where there isn't a strict hierarchy (see systems with ARM's
>> > > > > > own CCN, or
>> > > > > > Tilera's TILE-Mx), but there is some measure of closeness.
>> > > > >
>> > > > >
>> > > > > IIUC, as per ARMs CCN architecture, all core/clusters are at
>> > > > > equal distance
>> > > > > of DDR, i dont see any NUMA topology.
>> > >
>> > > The CCN is a ring interconnect, so CPU clusters (henceforth CPUs)
>> > > can be
>> > > connected with differing distances to RAM instances (or devices).
>> > >
>> > > Consider the simplified network below:
>> > >
>> > >   +-------+      +--------+      +-------+
>> > >   | CPU 0 |------| DRAM A |------| CPU 1 |
>> > >   +-------+      +--------+      +-------+
>> > >       |                              |
>> > >       |                              |
>> > >   +--------+                     +--------+
>> > >   | DRAM B |                     | DRAM C |
>> > >   +--------+                     +--------+
>> > >       |                              |
>> > >       |                              |
>> > >   +-------+      +--------+      +-------+
>> > >   | CPU 2 |------| DRAM D |------| CPU 3 |
>> > >   +-------+      +--------+      +-------+
>> > >
>> > > In this case CPUs and DRAMs are spaced evenly on the ring, but the
>> > > distance between an arbitrary CPU and DRAM is not uniform.
>> > >
>> > > CPU 0 can access DRAM A or DRAM B with a single hop, but accesses
>> > > to
>> > > DRAM C or DRAM D take three hops.
>> > >
>> > > An access from CPU 0 to DRAM C could contend with accesses from CPU
>> > > 1 to
>> > > DRAM D, as they share hops on the ring.
>> > >
>> > > There is definitely a NUMA topology here, but there's not a strict
>> > > hierarchy. I don't see how you would represent this with the
>> > > proposed
>> > > binding.
>> > can you please explain, how associativity property will represent
>> > this
>> > numa topology?
>
> Hi Mark,
>
> i am thinking, if we could not address(or becomes complex)  these topologies
> using associativity,
> we should think of an alternate binding which suits existing and upcoming
> arm64 platforms.
> can we think of below numa binding which is inline with ACPI and will
> address all sort of topologies!
>
> i am proposing as below,
>
> 1. introduce "proximity" node property. this property will be
> present in dt nodes like memory, cpu, bus and devices(like associativity
> property) and
> will tell which numa node(proximity domain) this dt node belongs to.
>
> examples:
>                cpu@000 {
>                         device_type = "cpu";
>                         compatible = "cavium,thunder", "arm,armv8";
>                         reg = <0x0 0x000>;
>                         enable-method = "psci";
>                         proximity = <0>;
>                 };
>                cpu@001 {
>                         device_type = "cpu";
>                         compatible = "cavium,thunder", "arm,armv8";
>                         reg = <0x0 0x001>;
>                         enable-method = "psci";
>                         proximity = <1>;
>                 };
>
>        memory@00000000 {
>                 device_type = "memory";
>                 reg = <0x0 0x01400000 0x3 0xFEC00000>;
>                 proximity =<0>;
>
>         };
>
>         memory@10000000000 {
>                 device_type = "memory";
>                 reg = <0x100 0x00400000 0x3 0xFFC00000>;
>                 proximity =<1>;
>         };
>
> pcie0@0x8480,00000000 {
>                 compatible = "cavium,thunder-pcie";
>                 device_type = "pci";
>                 msi-parent = <&its>;
>                 bus-range = <0 255>;
>                 #size-cells = <2>;
>                 #address-cells = <3>;
>                 #stream-id-cells = <1>;
>                 reg = <0x8480 0x00000000 0 0x10000000>; /* Configuration space */
>                 ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>, /* mem ranges */
>                          <0x03000000 0x8300 0x00000000 0x8300 0x00000000 0x500 0x00000000>;
>                proximity =<0>;
>         };
>
>
> 2. Introduce new dt node "proximity-map" which will capture the NxN numa
> node distance matrix.
>
> for example,  4 nodes connected in mesh/ring structure as,
> A(0) <connected to> B(1) <connected to> C(2) <connected to> D(3) <connected
> to> A(0)
>
> relative distance would be,
>       A -> B = 20
>       B -> C  = 20
>       C -> D = 20
>       D -> A = 20
>       A -> C = 40
>       B -> D = 40
>
> and dt presentation for this distance matrix is :
>
>        proximity-map {
>              node-count = <4>;
>              distance-matrix = <0 0  10>,
>                                 <0 1  20>,
>                                 <0 2  40>,
>                                 <0 3  20>,
>                                 <1 0  20>,
>                                 <1 1  10>,
>                                 <1 2  20>,
>                                 <1 3  40>,
>                                 <2 0  40>,
>                                 <2 1  20>,
>                                 <2 2  10>,
>                                 <2 3  20>,
>                                 <3 0  20>,
>                                 <3 1  40>,
>                                 <3 2  20>,
>                                 <3 3  10>;
>           }
>
> the entries like < 0 0 > < 1 1>  < 2 2> < 3 3> can be optional and code can
> put default value(local distance).
> the entries like <1 0> can be optional if <0 1> and <1 0> are of same
> distance.
>
>
>> > >
>> > > Likewise for the mesh networks (e.g. that of TILE-Mx)
>> > >
>> > > > > however, if there are 2 SoC connected thorough the CCN, then it
>> > > > > is very much
>> > > > > similar to cavium topology.
>> > > > >
>> > > > > > Must all of these have the same length? If so, why not have a
>> > > > > > #(whatever)-cells property in the root to describe the
>> > > > > > expected length?
>> > > > > > If not, how are they to be interpreted relative to each
>> > > > > > other?
>> > > > >
>> > > > >
>> > > > > yes, all are of default size.
>> > >
>> > > Where that size is...?
>> > >
>> > > > > IMHO, there is no need to add cells property.
>> > >
>> > > That might be the case, but it's unclear from the documentation. I
>> > > don't
>> > > see how one would parse / verify values currently.
>> > >
>> > > > > > > +the arm,associativity nodes. The first integer is the most
>> > > > > > > significant
>> > > > > > > +NUMA boundary and the following are progressively less
>> > > > > > > significant
>> > > > > > > boundaries.
>> > > > > > > +There can be more than one level of NUMA.
>> > > > > >
>> > > > > > I'm not clear on why this is necessary; the arm,associativity
>> > > > > > property
>> > > > > > is already ordered from most significant to least significant
>> > > > > > per its
>> > > > > > description.
>> > > > >
>> > > > >
>> > > > > first entry in arm,associativity-reference-points is used to
>> > > > > find which
>> > > > > entry in associativity defines node id.
>> > > > > also entries in arm,associativity-reference-points defines,
>> > > > > how many entries(depth) in associativity can be used to
>> > > > > calculate node
>> > > > > distance
>> > > > > in both level 1 and  multi level(hierarchical) numa topology.
>> > >
>> > > I think this needs a more thorough description; I don't follow the
>> > > current one.
>> > >
>> > > > > > Is this only expected at the root of the tree? Can it be re
>> > > > > > -defined in
>> > > > > > sub-nodes?
>> > > > >
>> > > > > yes it is defined only at the root.
>> > >
>> > > This needs to be stated explicitly.
>> > >
>> > > I see that this being the case, *,associativity-reference-points
>> > > would
>> > > be a more powerful property than the #(whatever)-cells property I
>> > > mentioned earlier, but a more thorough description is required.
>> > >
>> > > Thanks,
>> > > Mark.
>> > thanks
>> > Ganapat
>
>
> thanks
> Ganapat
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-10-01  5:25                             ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-10-01  5:25 UTC (permalink / raw)
  To: linux-arm-kernel

(Sending again; I don't know why plain text mode was unchecked.
Apologies for the inconvenience.)

On Thu, Oct 1, 2015 at 10:41 AM, Ganapatrao Kulkarni
<gpkulkarni@gmail.com> wrote:
> Hi Ben,
>
>
> On Thu, Oct 1, 2015 at 6:35 AM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
>>
>> On Wed, 2015-09-30 at 23:20 +0530, Ganapatrao Kulkarni wrote:
>> > Hi Ben,
>>
>> Before I dig in more (short on time right now), PAPR (at least a chunk
>> of it) was released publicly:
>>
>> https://members.openpowerfoundation.org/document/dl/469
>
> thanks a lot for sharing this document.
> i went through the chapter 15 of this doc which explains an example on
> hierarchical numa topology.
> i still could not represent the ring/mesh numa topology using associativity,
> which will be present in other upcoming arm64 platforms.
>
>>
>> (You don't need to be a member nor to sign up to get it)
>>
>> Cheers,
>> Ben.
>>
>> > On Wed, Sep 30, 2015 at 4:23 PM, Mark Rutland <mark.rutland@arm.com>
>> > wrote:
>> > > On Tue, Sep 29, 2015 at 09:38:04AM +0100, Ganapatrao Kulkarni
>> > > wrote:
>> > > > (sending again, by mistake it was set to html mode)
>> > > >
>> > > > On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
>> > > > <gpkulkarni@gmail.com> wrote:
>> > > > > Hi Mark,
>> > > > >
>> > > > > I have tried to answer your comments, in the meantime we are
>> > > > > waiting for Ben
>> > > > > to share the details.
>> > > > >
>> > > > > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <
>> > > > > mark.rutland at arm.com> wrote:
>> > > > > >
>> > > > > > Hi,
>> > > > > >
>> > > > > > On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni
>> > > > > > wrote:
>> > > > > > > DT bindings for numa map for memory, cores and IOs using
>> > > > > > > arm,associativity device node property.
>> > > > > >
>> > > > > > Given this is just a copy of ibm,associativity, I'm not sure
>> > > > > > I see much
>> > > > > > point in renaming the properties.
>> > > > > >
>> > > > > > However, (somewhat counter to that) I'm also concerned that
>> > > > > > this isn't
>> > > > > > sufficient for systems we're beginning to see today (more on
>> > > > > > that
>> > > > > > below), so I don't think a simple copy of ibm,associativity
>> > > > > > is good
>> > > > > > enough.
>> > > > >
>> > > > > it is just copy right now, however it can evolve when we come
>> > > > > across more
>> > > > > arm64 numa platforms
>> > >
>> > > Whatever we do I suspect we'll have to evolve it as new platforms
>> > > appear. As I mentioned there are contemporary NUMA ARM64 platforms
>> > > (e.g.
>> > > those with CCN) that I don't think we can ignore now given we'll
>> > > have to
>> > > cater for them.
>> > >
>> > > > > > > +==========================================================
>> > > > > > > ====================
>> > > > > > > +2 - arm,associativity
>> > > > > > >
>> > > > > > > +==========================================================
>> > > > > > > ====================
>> > > > > > > +The mapping is done using arm,associativity device
>> > > > > > > property.
>> > > > > > > +this property needs to be present in every device node
>> > > > > > > which needs to
>> > > > > > > to be
>> > > > > > > +mapped to numa nodes.
>> > > > > >
>> > > > > > Can't there be some inheritance? e.g. all devices on a bus
>> > > > > > with an
>> > > > > > arm,associativity property being assumed to share that value?
>> > > > >
>> > > > > yes there is inheritance and respective bus drivers should take
>> > > > > care of it,
>> > > > > like pci driver does at present.
>> > >
>> > > Ok.
>> > >
>> > > That seems counter to my initial interpretation of the wording that
>> > > the
>> > > property must be present on device nodes that need to be mapped to
>> > > NUMA
>> > > nodes.
>> > >
>> > > Is there any simple way of describing the set of nodes that need
>> > > this
>> > > property?
>> > >
>> > > > > > > +topology and boundary in the system at which a significant
>> > > > > > > difference
>> > > > > > > in
>> > > > > > > +performance can be measured between cross-device accesses
>> > > > > > > within
>> > > > > > > +a single location and those spanning multiple locations.
>> > > > > > > +The first cell always contains the broadest subdivision
>> > > > > > > within the
>> > > > > > > system,
>> > > > > > > +while the last cell enumerates the individual devices,
>> > > > > > > such as an SMT
>> > > > > > > thread
>> > > > > > > +of a CPU, or a bus bridge within an SoC".
>> > > > > >
>> > > > > > While this gives us some hierarchy, this doesn't seem to
>> > > > > > encode relative
>> > > > > > distances at all. That seems like an oversight.
>> > > > >
>> > > > >
>> > > > > distance is computed, will add the details to document.
>> > > > > local nodes will have distance as 10(LOCAL_DISTANCE) and every
>> > > > > level, the
>> > > > > distance multiplies by 2.
>> > > > > for example, for level 1 numa topology, distance from local
>> > > > > node to remote
>> > > > > node will be 20.
>> > >
>> > > This seems arbitrary.
>> > >
>> > > Why not always have this explicitly described?
>> > >
>> > > > > > Additionally, I'm somewhat unclear on how what you'd be
>> > > > > > expected to
>> > > > > > provide for this property in cases like ring or mesh
>> > > > > > interconnects,
>> > > > > > where there isn't a strict hierarchy (see systems with ARM's
>> > > > > > own CCN, or
>> > > > > > Tilera's TILE-Mx), but there is some measure of closeness.
>> > > > >
>> > > > >
>> > > > > IIUC, as per ARMs CCN architecture, all core/clusters are at
>> > > > > equal distance
>> > > > > of DDR, i dont see any NUMA topology.
>> > >
>> > > The CCN is a ring interconnect, so CPU clusters (henceforth CPUs)
>> > > can be
>> > > connected with differing distances to RAM instances (or devices).
>> > >
>> > > Consider the simplified network below:
>> > >
>> > >   +-------+      +--------+      +-------+
>> > >   | CPU 0 |------| DRAM A |------| CPU 1 |
>> > >   +-------+      +--------+      +-------+
>> > >       |                              |
>> > >       |                              |
>> > >   +--------+                     +--------+
>> > >   | DRAM B |                     | DRAM C |
>> > >   +--------+                     +--------+
>> > >       |                              |
>> > >       |                              |
>> > >   +-------+      +--------+      +-------+
>> > >   | CPU 2 |------| DRAM D |------| CPU 3 |
>> > >   +-------+      +--------+      +-------+
>> > >
>> > > In this case CPUs and DRAMs are spaced evenly on the ring, but the
>> > > distance between an arbitrary CPU and DRAM is not uniform.
>> > >
>> > > CPU 0 can access DRAM A or DRAM B with a single hop, but accesses
>> > > to
>> > > DRAM C or DRAM D take three hops.
>> > >
>> > > An access from CPU 0 to DRAM C could contend with accesses from CPU
>> > > 1 to
>> > > DRAM D, as they share hops on the ring.
>> > >
>> > > There is definitely a NUMA topology here, but there's not a strict
>> > > hierarchy. I don't see how you would represent this with the
>> > > proposed
>> > > binding.
>> > can you please explain, how associativity property will represent
>> > this
>> > numa topology?
>
> Hi Mark,
>
> i am thinking, if we could not address(or becomes complex)  these topologies
> using associativity,
> we should think of an alternate binding which suits existing and upcoming
> arm64 platforms.
> can we think of below numa binding which is inline with ACPI and will
> address all sort of topologies!
>
> i am proposing as below,
>
> 1. introduce "proximity" node property. this property will be
> present in dt nodes like memory, cpu, bus and devices(like associativity
> property) and
> will tell which numa node(proximity domain) this dt node belongs to.
>
> examples:
>                cpu@000 {
>                         device_type = "cpu";
>                         compatible = "cavium,thunder", "arm,armv8";
>                         reg = <0x0 0x000>;
>                         enable-method = "psci";
>                         proximity = <0>;
>                 };
>                cpu@001 {
>                         device_type = "cpu";
>                         compatible = "cavium,thunder", "arm,armv8";
>                         reg = <0x0 0x001>;
>                         enable-method = "psci";
>                         proximity = <1>;
>                 };
>
>        memory@00000000 {
>                 device_type = "memory";
>                 reg = <0x0 0x01400000 0x3 0xFEC00000>;
>                 proximity =<0>;
>
>         };
>
>         memory@10000000000 {
>                 device_type = "memory";
>                 reg = <0x100 0x00400000 0x3 0xFFC00000>;
>                 proximity =<1>;
>         };
>
> pcie0@0x8480,00000000 {
>                 compatible = "cavium,thunder-pcie";
>                 device_type = "pci";
>                 msi-parent = <&its>;
>                 bus-range = <0 255>;
>                 #size-cells = <2>;
>                 #address-cells = <3>;
>                 #stream-id-cells = <1>;
>                 reg = <0x8480 0x00000000 0 0x10000000>; /* Configuration space */
>                 ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>, /* mem ranges */
>                          <0x03000000 0x8300 0x00000000 0x8300 0x00000000 0x500 0x00000000>;
>                proximity =<0>;
>         };
>
>
> 2. Introduce new dt node "proximity-map" which will capture the NxN numa
> node distance matrix.
>
> for example,  4 nodes connected in mesh/ring structure as,
> A(0) <connected to> B(1) <connected to> C(2) <connected to> D(3) <connected
> to> A(0)
>
> relative distance would be,
>       A -> B = 20
>       B -> C  = 20
>       C -> D = 20
>       D -> A = 20
>       A -> C = 40
>       B -> D = 40
>
> and dt presentation for this distance matrix is :
>
>        proximity-map {
>              node-count = <4>;
>              distance-matrix = <0 0  10>,
>                                 <0 1  20>,
>                                 <0 2  40>,
>                                 <0 3  20>,
>                                 <1 0  20>,
>                                 <1 1  10>,
>                                 <1 2  20>,
>                                 <1 3  40>,
>                                 <2 0  40>,
>                                 <2 1  20>,
>                                 <2 2  10>,
>                                 <2 3  20>,
>                                 <3 0  20>,
>                                 <3 1  40>,
>                                 <3 2  20>,
>                                 <3 3  10>;
>           };
>
> the entries like < 0 0 > < 1 1>  < 2 2> < 3 3> can be optional and code can
> put default value(local distance).
> the entries like <1 0> can be optional if <0 1> and <1 0> are of same
> distance.
>
>
>> > >
>> > > Likewise for the mesh networks (e.g. that of TILE-Mx)
>> > >
>> > > > > however, if there are 2 SoC connected through the CCN, then it
>> > > > > is very much
>> > > > > similar to cavium topology.
>> > > > >
>> > > > > > Must all of these have the same length? If so, why not have a
>> > > > > > #(whatever)-cells property in the root to describe the
>> > > > > > expected length?
>> > > > > > If not, how are they to be interpreted relative to each
>> > > > > > other?
>> > > > >
>> > > > >
>> > > > > yes, all are of default size.
>> > >
>> > > Where that size is...?
>> > >
>> > > > > IMHO, there is no need to add cells property.
>> > >
>> > > That might be the case, but it's unclear from the documentation. I
>> > > don't
>> > > see how one would parse / verify values currently.
>> > >
>> > > > > > > +the arm,associativity nodes. The first integer is the most
>> > > > > > > significant
>> > > > > > > +NUMA boundary and the following are progressively less
>> > > > > > > significant
>> > > > > > > boundaries.
>> > > > > > > +There can be more than one level of NUMA.
>> > > > > >
>> > > > > > I'm not clear on why this is necessary; the arm,associativity
>> > > > > > property
>> > > > > > is already ordered from most significant to least significant
>> > > > > > per its
>> > > > > > description.
>> > > > >
>> > > > >
>> > > > > first entry in arm,associativity-reference-points is used to
>> > > > > find which
>> > > > > entry in associativity defines node id.
>> > > > > also entries in arm,associativity-reference-points defines,
>> > > > > how many entries(depth) in associativity can be used to
>> > > > > calculate node
>> > > > > distance
>> > > > > in both level 1 and  multi level(hierarchical) numa topology.
>> > >
>> > > I think this needs a more thorough description; I don't follow the
>> > > current one.
>> > >
>> > > > > > Is this only expected at the root of the tree? Can it be re
>> > > > > > -defined in
>> > > > > > sub-nodes?
>> > > > >
>> > > > > yes it is defined only at the root.
>> > >
>> > > This needs to be stated explicitly.
>> > >
>> > > I see that this being the case, *,associativity-reference-points
>> > > would
>> > > be a more powerful property than the #(whatever)-cells property I
>> > > mentioned earlier, but a more thorough description is required.
>> > >
>> > > Thanks,
>> > > Mark.
>> > thanks
>> > Ganapat
>
>
> thanks
> Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-10-01  5:11                       ` Ganapatrao Kulkarni
@ 2015-10-01  7:17                             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 94+ messages in thread
From: Benjamin Herrenschmidt @ 2015-10-01  7:17 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, Mark Rutland
  Cc: Prasun.Kapoor-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8, Rob Herring,
	Leizhen (ThunderTown),
	Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd

On Thu, 2015-10-01 at 10:41 +0530, Ganapatrao Kulkarni wrote:
> i still could not represent the ring/mesh numa topology using
> associativity, which will be present in other upcoming arm64
> platforms.

Right. It should be possible to represent it using the multiple list as
a multi-path problem, but it's a bit awkward.

It does look like the representation might not work well for that case

Cheers,
Ben.


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-10-01  5:11                       ` Ganapatrao Kulkarni
@ 2015-10-01 11:36                             ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-10-01 11:36 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Prasun.Kapoor-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8, Rob Herring,
	Leizhen (ThunderTown),
	Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd

Hi Mark,

On Thu, Oct 1, 2015 at 10:41 AM, Ganapatrao Kulkarni
<gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Hi Ben,
>
>
> On Thu, Oct 1, 2015 at 6:35 AM, Benjamin Herrenschmidt
> <benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org> wrote:
>>
>> On Wed, 2015-09-30 at 23:20 +0530, Ganapatrao Kulkarni wrote:
>> > Hi Ben,
>>
>> Before I dig in more (short on time right now), PAPR (at least a chunk
>> of it) was released publicly:
>>
>> https://members.openpowerfoundation.org/document/dl/469
>
> thanks a lot for sharing this document.
> i went through the chapter 15 of this doc which explains an example on
> hierarchical numa topology.
> i still could not represent the ring/mesh numa topology using associativity,
> which will be present in other upcoming arm64 platforms.
>
>>
>> (You don't need to be a member nor to sign up to get it)
>>
>> Cheers,
>> Ben.
>>
>> > On Wed, Sep 30, 2015 at 4:23 PM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org>
>> > wrote:
>> > > On Tue, Sep 29, 2015 at 09:38:04AM +0100, Ganapatrao Kulkarni
>> > > wrote:
>> > > > (sending again, by mistake it was set to html mode)
>> > > >
>> > > > On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
>> > > > <gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> > > > > Hi Mark,
>> > > > >
>> > > > > I have tried to answer your comments, in the meantime we are
>> > > > > waiting for Ben
>> > > > > to share the details.
>> > > > >
>> > > > > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <
>> > > > > mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
>> > > > > >
>> > > > > > Hi,
>> > > > > >
>> > > > > > On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni
>> > > > > > wrote:
>> > > > > > > DT bindings for numa map for memory, cores and IOs using
>> > > > > > > arm,associativity device node property.
>> > > > > >
>> > > > > > Given this is just a copy of ibm,associativity, I'm not sure
>> > > > > > I see much
>> > > > > > point in renaming the properties.
>> > > > > >
>> > > > > > However, (somewhat counter to that) I'm also concerned that
>> > > > > > this isn't
>> > > > > > sufficient for systems we're beginning to see today (more on
>> > > > > > that
>> > > > > > below), so I don't think a simple copy of ibm,associativity
>> > > > > > is good
>> > > > > > enough.
>> > > > >
>> > > > > it is just copy right now, however it can evolve when we come
>> > > > > across more
>> > > > > arm64 numa platforms
>> > >
>> > > Whatever we do I suspect we'll have to evolve it as new platforms
>> > > appear. As I mentioned there are contemporary NUMA ARM64 platforms
>> > > (e.g.
>> > > those with CCN) that I don't think we can ignore now given we'll
>> > > have to
>> > > cater for them.
>> > >
>> > > > > > > +==========================================================
>> > > > > > > ====================
>> > > > > > > +2 - arm,associativity
>> > > > > > >
>> > > > > > > +==========================================================
>> > > > > > > ====================
>> > > > > > > +The mapping is done using arm,associativity device
>> > > > > > > property.
>> > > > > > > +this property needs to be present in every device node
>> > > > > > > which needs to be
>> > > > > > > +mapped to numa nodes.
>> > > > > >
>> > > > > > Can't there be some inheritance? e.g. all devices on a bus
>> > > > > > with an
>> > > > > > arm,associativity property being assumed to share that value?
>> > > > >
>> > > > > yes there is inheritance and respective bus drivers should take
>> > > > > care of it,
>> > > > > like pci driver does at present.
>> > >
>> > > Ok.
>> > >
>> > > That seems counter to my initial interpretation of the wording that
>> > > the
>> > > property must be present on device nodes that need to be mapped to
>> > > NUMA
>> > > nodes.
>> > >
>> > > Is there any simple way of describing the set of nodes that need
>> > > this
>> > > property?
>> > >
>> > > > > > > +topology and boundary in the system at which a significant
>> > > > > > > difference
>> > > > > > > in
>> > > > > > > +performance can be measured between cross-device accesses
>> > > > > > > within
>> > > > > > > +a single location and those spanning multiple locations.
>> > > > > > > +The first cell always contains the broadest subdivision
>> > > > > > > within the
>> > > > > > > system,
>> > > > > > > +while the last cell enumerates the individual devices,
>> > > > > > > such as an SMT
>> > > > > > > thread
>> > > > > > > +of a CPU, or a bus bridge within an SoC".
>> > > > > >
>> > > > > > While this gives us some hierarchy, this doesn't seem to
>> > > > > > encode relative
>> > > > > > distances at all. That seems like an oversight.
>> > > > >
>> > > > >
>> > > > > distance is computed, will add the details to document.
>> > > > > local nodes will have distance as 10(LOCAL_DISTANCE) and every
>> > > > > level, the
>> > > > > distance multiplies by 2.
>> > > > > for example, for level 1 numa topology, distance from local
>> > > > > node to remote
>> > > > > node will be 20.
>> > >
>> > > This seems arbitrary.
>> > >
>> > > Why not always have this explicitly described?
>> > >
>> > > > > > Additionally, I'm somewhat unclear on how what you'd be
>> > > > > > expected to
>> > > > > > provide for this property in cases like ring or mesh
>> > > > > > interconnects,
>> > > > > > where there isn't a strict hierarchy (see systems with ARM's
>> > > > > > own CCN, or
>> > > > > > Tilera's TILE-Mx), but there is some measure of closeness.
>> > > > >
>> > > > >
>> > > > > IIUC, as per ARMs CCN architecture, all core/clusters are at
>> > > > > equal distance
>> > > > > of DDR, i dont see any NUMA topology.
>> > >
>> > > The CCN is a ring interconnect, so CPU clusters (henceforth CPUs)
>> > > can be
>> > > connected with differing distances to RAM instances (or devices).
>> > >
>> > > Consider the simplified network below:
>> > >
>> > >   +-------+      +--------+      +-------+
>> > >   | CPU 0 |------| DRAM A |------| CPU 1 |
>> > >   +-------+      +--------+      +-------+
>> > >       |                              |
>> > >       |                              |
>> > >   +--------+                     +--------+
>> > >   | DRAM B |                     | DRAM C |
>> > >   +--------+                     +--------+
>> > >       |                              |
>> > >       |                              |
>> > >   +-------+      +--------+      +-------+
>> > >   | CPU 2 |------| DRAM D |------| CPU 3 |
>> > >   +-------+      +--------+      +-------+
>> > >
>> > > In this case CPUs and DRAMs are spaced evenly on the ring, but the
>> > > distance between an arbitrary CPU and DRAM is not uniform.
>> > >
>> > > CPU 0 can access DRAM A or DRAM B with a single hop, but accesses
>> > > to
>> > > DRAM C or DRAM D take three hops.
>> > >
>> > > An access from CPU 0 to DRAM C could contend with accesses from CPU
>> > > 1 to
>> > > DRAM D, as they share hops on the ring.
>> > >
>> > > There is definitely a NUMA topology here, but there's not a strict
>> > > hierarchy. I don't see how you would represent this with the
>> > > proposed
>> > > binding.
>> > can you please explain, how associativity property will represent
>> > this
>> > numa topology?
>
> Hi Mark,
>
> i am thinking, if we could not address(or becomes complex)  these topologies
> using associativity,
> we should think of an alternate binding which suits existing and upcoming
> arm64 platforms.
> can we think of below numa binding which is inline with ACPI and will
> address all sort of topologies!
>
> i am proposing as below,
>
> 1. introduce "proximity" node property. this property will be
> present in dt nodes like memory, cpu, bus and devices(like associativity
> property) and
> will tell which numa node(proximity domain) this dt node belongs to.
>
> examples:
>                cpu@000 {
>                         device_type = "cpu";
>                         compatible = "cavium,thunder", "arm,armv8";
>                         reg = <0x0 0x000>;
>                         enable-method = "psci";
>                         proximity = <0>;
>                 };
>                cpu@001 {
>                         device_type = "cpu";
>                         compatible = "cavium,thunder", "arm,armv8";
>                         reg = <0x0 0x001>;
>                         enable-method = "psci";
>                         proximity = <1>;
>                 };
>
>        memory@00000000 {
>                 device_type = "memory";
>                 reg = <0x0 0x01400000 0x3 0xFEC00000>;
>                 proximity =<0>;
>
>         };
>
>         memory@10000000000 {
>                 device_type = "memory";
>                 reg = <0x100 0x00400000 0x3 0xFFC00000>;
>                 proximity =<1>;
>         };
>
> pcie0@0x8480,00000000 {
>                 compatible = "cavium,thunder-pcie";
>                 device_type = "pci";
>                 msi-parent = <&its>;
>                 bus-range = <0 255>;
>                 #size-cells = <2>;
>                 #address-cells = <3>;
>                 #stream-id-cells = <1>;
>                 reg = <0x8480 0x00000000 0 0x10000000>;  /*Configuration
> space */
>                 ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000
> 0x70 0x00000000>, /* mem ranges */
>                          <0x03000000 0x8300 0x00000000 0x8300 0x00000000
> 0x500 0x00000000>;
>                proximity =<0>;
>         };
>
>
> 2. Introduce new dt node "proximity-map" which will capture the NxN numa
> node distance matrix.
>
> for example,  4 nodes connected in mesh/ring structure as,
> A(0) <connected to> B(1) <connected to> C(2) <connected to> D(3) <connected
> to> A(0)
>
> relative distance would be,
>       A -> B = 20
>       B -> C  = 20
>       C -> D = 20
>       D -> A = 20
>       A -> C = 40
>       B -> D = 40
>
> and dt presentation for this distance matrix is :
>
>        proximity-map {
>              node-count = <4>;
>              distance-matrix = <0 0  10>,
>                                 <0 1  20>,
>                                 <0 2  40>,
>                                 <0 3  20>,
>                                 <1 0  20>,
>                                 <1 1  10>,
>                                 <1 2  20>,
>                                 <1 3  40>,
>                                 <2 0  40>,
>                                 <2 1  20>,
>                                 <2 2  10>,
>                                 <2 3  20>,
>                                 <3 0  20>,
>                                 <3 1  40>,
>                                 <3 2  20>,
>                                 <3 3  10>;
>           };
>
> the entries like < 0 0 > < 1 1>  < 2 2> < 3 3> can be optional and code can
> put default value(local distance).
> the entries like <1 0> can be optional if <0 1> and <1 0> are of same
> distance.
Does this binding look OK?
I can implement it and submit it in the next version of the patchset.
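
For illustration only, here is a rough sketch (plain Python, not kernel
code) of how the optional entries could be filled in when the map is
parsed. The function name and the triplet list are made up for the
example; only the default local distance of 10 and the 4-node ring values
come from the binding proposed above.

```python
LOCAL_DISTANCE = 10  # default for omitted <n n> entries, as proposed above


def build_distance_matrix(node_count, triplets):
    """Expand (from, to, distance) triplets into a full NxN matrix.

    Hypothetical helper: it only demonstrates the two defaulting rules
    (local distance on the diagonal, mirroring of one-way entries).
    """
    dist = [[None] * node_count for _ in range(node_count)]

    # First record every entry that was given explicitly.
    for src, dst, d in triplets:
        if not (0 <= src < node_count and 0 <= dst < node_count):
            raise ValueError(f"node id out of range in <{src} {dst} {d}>")
        dist[src][dst] = d

    for i in range(node_count):
        # Omitted <i i> entries fall back to the local distance.
        if dist[i][i] is None:
            dist[i][i] = LOCAL_DISTANCE
        for j in range(node_count):
            if dist[i][j] is None:
                # An omitted <i j> mirrors <j i> if that one was given.
                if dist[j][i] is None:
                    raise ValueError(f"no distance for nodes {i} and {j}")
                dist[i][j] = dist[j][i]
    return dist


# The 4-node ring from the example, with all optional entries left out.
ring = [(0, 1, 20), (0, 2, 40), (0, 3, 20),
        (1, 2, 20), (1, 3, 40),
        (2, 3, 20)]
matrix = build_distance_matrix(4, ring)
```

With only the six one-way entries given, this yields the same 4x4 matrix
as the full 16-entry distance-matrix listed above. A pair of nodes with no
entry in either direction is treated as an error rather than guessed.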
>
>
>> > >
>> > > Likewise for the mesh networks (e.g. that of TILE-Mx)
>> > >
>> > > > > however, if there are 2 SoC connected through the CCN, then it
>> > > > > is very much
>> > > > > similar to cavium topology.
>> > > > >
>> > > > > > Must all of these have the same length? If so, why not have a
>> > > > > > #(whatever)-cells property in the root to describe the
>> > > > > > expected length?
>> > > > > > If not, how are they to be interpreted relative to each
>> > > > > > other?
>> > > > >
>> > > > >
>> > > > > yes, all are of default size.
>> > >
>> > > Where that size is...?
>> > >
>> > > > > IMHO, there is no need to add cells property.
>> > >
>> > > That might be the case, but it's unclear from the documentation. I
>> > > don't
>> > > see how one would parse / verify values currently.
>> > >
>> > > > > > > +the arm,associativity nodes. The first integer is the most
>> > > > > > > significant
>> > > > > > > +NUMA boundary and the following are progressively less
>> > > > > > > significant
>> > > > > > > boundaries.
>> > > > > > > +There can be more than one level of NUMA.
>> > > > > >
>> > > > > > I'm not clear on why this is necessary; the arm,associativity
>> > > > > > property
>> > > > > > is already ordered from most significant to least significant
>> > > > > > per its
>> > > > > > description.
>> > > > >
>> > > > >
>> > > > > first entry in arm,associativity-reference-points is used to
>> > > > > find which
>> > > > > entry in associativity defines node id.
>> > > > > also entries in arm,associativity-reference-points defines,
>> > > > > how many entries(depth) in associativity can be used to
>> > > > > calculate node
>> > > > > distance
>> > > > > in both level 1 and  multi level(hierarchical) numa topology.
>> > >
>> > > I think this needs a more thorough description; I don't follow the
>> > > current one.
>> > >
>> > > > > > Is this only expected at the root of the tree? Can it be re
>> > > > > > -defined in
>> > > > > > sub-nodes?
>> > > > >
>> > > > > yes it is defined only at the root.
>> > >
>> > > This needs to be stated explicitly.
>> > >
>> > > I see that this being the case, *,associativity-reference-points
>> > > would
>> > > be a more powerful property than the #(whatever)-cells property I
>> > > mentioned earlier, but a more thorough description is required.
>> > >
>> > > Thanks,
>> > > Mark.
>> > thanks
>> > Ganapat
>
>
> thanks
> Ganapat
thanks
Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-10-01 11:36                             ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-10-01 11:36 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark,

On Thu, Oct 1, 2015 at 10:41 AM, Ganapatrao Kulkarni
<gpkulkarni@gmail.com> wrote:
> Hi Ben,
>
>
> On Thu, Oct 1, 2015 at 6:35 AM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
>>
>> On Wed, 2015-09-30 at 23:20 +0530, Ganapatrao Kulkarni wrote:
>> > Hi Ben,
>>
>> Before I dig in more (short on time right now), PAPR (at least a chunk
>> of it) was released publicly:
>>
>> https://members.openpowerfoundation.org/document/dl/469
>
> thanks a lot for sharing this document.
> i went through the chapter 15 of this doc which explains an example on
> hierarchical numa topology.
> i still could not represent the ring/mesh numa topology using associativity,
> which will be present in other upcoming arm64 platforms.
>
>>
>> (You don't need to be a member nor to sign up to get it)
>>
>> Cheers,
>> Ben.
>>
>> > On Wed, Sep 30, 2015 at 4:23 PM, Mark Rutland <mark.rutland@arm.com>
>> > wrote:
>> > > On Tue, Sep 29, 2015 at 09:38:04AM +0100, Ganapatrao Kulkarni
>> > > wrote:
>> > > > (sending again, by mistake it was set to html mode)
>> > > >
>> > > > On Tue, Sep 29, 2015 at 2:05 PM, Ganapatrao Kulkarni
>> > > > <gpkulkarni@gmail.com> wrote:
>> > > > > Hi Mark,
>> > > > >
>> > > > > I have tried to answer your comments, in the meantime we are
>> > > > > waiting for Ben
>> > > > > to share the details.
>> > > > >
>> > > > > On Fri, Aug 28, 2015 at 6:02 PM, Mark Rutland <
>> > > > > mark.rutland@arm.com> wrote:
>> > > > > >
>> > > > > > Hi,
>> > > > > >
>> > > > > > On Fri, Aug 14, 2015 at 05:39:32PM +0100, Ganapatrao Kulkarni
>> > > > > > wrote:
>> > > > > > > DT bindings for numa map for memory, cores and IOs using
>> > > > > > > arm,associativity device node property.
>> > > > > >
>> > > > > > Given this is just a copy of ibm,associativity, I'm not sure
>> > > > > > I see much
>> > > > > > point in renaming the properties.
>> > > > > >
>> > > > > > However, (somewhat counter to that) I'm also concerned that
>> > > > > > this isn't
>> > > > > > sufficient for systems we're beginning to see today (more on
>> > > > > > that
>> > > > > > below), so I don't think a simple copy of ibm,associativity
>> > > > > > is good
>> > > > > > enough.
>> > > > >
>> > > > > it is just copy right now, however it can evolve when we come
>> > > > > across more
>> > > > > arm64 numa platforms
>> > >
>> > > Whatever we do I suspect we'll have to evolve it as new platforms
>> > > appear. As I mentioned there are contemporary NUMA ARM64 platforms
>> > > (e.g.
>> > > those with CCN) that I don't think we can ignore now given we'll
>> > > have to
>> > > cater for them.
>> > >
>> > > > > > > +==========================================================
>> > > > > > > ====================
>> > > > > > > +2 - arm,associativity
>> > > > > > >
>> > > > > > > +==========================================================
>> > > > > > > ====================
>> > > > > > > +The mapping is done using arm,associativity device
>> > > > > > > property.
>> > > > > > > +this property needs to be present in every device node
>> > > > > > > which needs to be
>> > > > > > > +mapped to numa nodes.
>> > > > > >
>> > > > > > Can't there be some inheritance? e.g. all devices on a bus
>> > > > > > with an
>> > > > > > arm,associativity property being assumed to share that value?
>> > > > >
>> > > > > yes there is inheritance and respective bus drivers should take
>> > > > > care of it,
>> > > > > like pci driver does at present.
>> > >
>> > > Ok.
>> > >
>> > > That seems counter to my initial interpretation of the wording that
>> > > the
>> > > property must be present on device nodes that need to be mapped to
>> > > NUMA
>> > > nodes.
>> > >
>> > > Is there any simple way of describing the set of nodes that need
>> > > this
>> > > property?
>> > >
>> > > > > > > +topology and boundary in the system at which a significant
>> > > > > > > difference
>> > > > > > > in
>> > > > > > > +performance can be measured between cross-device accesses
>> > > > > > > within
>> > > > > > > +a single location and those spanning multiple locations.
>> > > > > > > +The first cell always contains the broadest subdivision
>> > > > > > > within the
>> > > > > > > system,
>> > > > > > > +while the last cell enumerates the individual devices,
>> > > > > > > such as an SMT
>> > > > > > > thread
>> > > > > > > +of a CPU, or a bus bridge within an SoC".
>> > > > > >
>> > > > > > While this gives us some hierarchy, this doesn't seem to
>> > > > > > encode relative
>> > > > > > distances at all. That seems like an oversight.
>> > > > >
>> > > > >
>> > > > > distance is computed, will add the details to document.
>> > > > > local nodes will have distance as 10(LOCAL_DISTANCE) and every
>> > > > > level, the
>> > > > > distance multiplies by 2.
>> > > > > for example, for level 1 numa topology, distance from local
>> > > > > node to remote
>> > > > > node will be 20.
>> > >
>> > > This seems arbitrary.
>> > >
>> > > Why not always have this explicitly described?
>> > >
>> > > > > > Additionally, I'm somewhat unclear on how what you'd be
>> > > > > > expected to
>> > > > > > provide for this property in cases like ring or mesh
>> > > > > > interconnects,
>> > > > > > where there isn't a strict hierarchy (see systems with ARM's
>> > > > > > own CCN, or
>> > > > > > Tilera's TILE-Mx), but there is some measure of closeness.
>> > > > >
>> > > > >
>> > > > > IIUC, as per ARMs CCN architecture, all core/clusters are at
>> > > > > equal distance
>> > > > > of DDR, i dont see any NUMA topology.
>> > >
>> > > The CCN is a ring interconnect, so CPU clusters (henceforth CPUs)
>> > > can be
>> > > connected with differing distances to RAM instances (or devices).
>> > >
>> > > Consider the simplified network below:
>> > >
>> > >   +-------+      +--------+      +-------+
>> > >   | CPU 0 |------| DRAM A |------| CPU 1 |
>> > >   +-------+      +--------+      +-------+
>> > >       |                              |
>> > >       |                              |
>> > >   +--------+                     +--------+
>> > >   | DRAM B |                     | DRAM C |
>> > >   +--------+                     +--------+
>> > >       |                              |
>> > >       |                              |
>> > >   +-------+      +--------+      +-------+
>> > >   | CPU 2 |------| DRAM D |------| CPU 3 |
>> > >   +-------+      +--------+      +-------+
>> > >
>> > > In this case CPUs and DRAMs are spaced evenly on the ring, but the
>> > > distance between an arbitrary CPU and DRAM is not uniform.
>> > >
>> > > CPU 0 can access DRAM A or DRAM B with a single hop, but accesses
>> > > to
>> > > DRAM C or DRAM D take three hops.
>> > >
>> > > An access from CPU 0 to DRAM C could contend with accesses from CPU
>> > > 1 to
>> > > DRAM D, as they share hops on the ring.
>> > >
>> > > There is definitely a NUMA topology here, but there's not a strict
>> > > hierarchy. I don't see how you would represent this with the
>> > > proposed
>> > > binding.
>> > can you please explain, how associativity property will represent
>> > this
>> > numa topology?
>
> Hi Mark,
>
> i am thinking, if we could not address(or becomes complex)  these topologies
> using associativity,
> we should think of an alternate binding which suits existing and upcoming
> arm64 platforms.
> can we think of below numa binding which is inline with ACPI and will
> address all sort of topologies!
>
> i am proposing as below,
>
> 1. introduce "proximity" node property. this property will be
> present in dt nodes like memory, cpu, bus and devices(like associativity
> property) and
> will tell which numa node(proximity domain) this dt node belongs to.
>
> examples:
>                cpu@000 {
>                         device_type = "cpu";
>                         compatible = "cavium,thunder", "arm,armv8";
>                         reg = <0x0 0x000>;
>                         enable-method = "psci";
>                         proximity = <0>;
>                 };
>                cpu@001 {
>                         device_type = "cpu";
>                         compatible = "cavium,thunder", "arm,armv8";
>                         reg = <0x0 0x001>;
>                         enable-method = "psci";
>                         proximity = <1>;
>                 };
>
>        memory@00000000 {
>                 device_type = "memory";
>                 reg = <0x0 0x01400000 0x3 0xFEC00000>;
>                 proximity =<0>;
>
>         };
>
>         memory@10000000000 {
>                 device_type = "memory";
>                 reg = <0x100 0x00400000 0x3 0xFFC00000>;
>                 proximity =<1>;
>         };
>
> pcie0@0x8480,00000000 {
>                 compatible = "cavium,thunder-pcie";
>                 device_type = "pci";
>                 msi-parent = <&its>;
>                 bus-range = <0 255>;
>                 #size-cells = <2>;
>                 #address-cells = <3>;
>                 #stream-id-cells = <1>;
>                 reg = <0x8480 0x00000000 0 0x10000000>;  /*Configuration
> space */
>                 ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000
> 0x70 0x00000000>, /* mem ranges */
>                          <0x03000000 0x8300 0x00000000 0x8300 0x00000000
> 0x500 0x00000000>;
>                proximity =<0>;
>         };
>
>
> 2. Introduce new dt node "proximity-map" which will capture the NxN numa
> node distance matrix.
>
> for example,  4 nodes connected in mesh/ring structure as,
> A(0) <connected to> B(1) <connected to> C(2) <connected to> D(3) <connected
> to> A(0)
>
> relative distance would be,
>       A -> B = 20
>       B -> C  = 20
>       C -> D = 20
>       D -> A = 20
>       A -> C = 40
>       B -> D = 40
>
> and dt presentation for this distance matrix is :
>
>        proximity-map {
>              node-count = <4>;
>              distance-matrix = <0 0  10>,
>                                 <0 1  20>,
>                                 <0 2  40>,
>                                 <0 3  20>,
>                                 <1 0  20>,
>                                 <1 1  10>,
>                                 <1 2  20>,
>                                 <1 3  40>,
>                                 <2 0  40>,
>                                 <2 1  20>,
>                                 <2 2  10>,
>                                 <2 3  20>,
>                                 <3 0  20>,
>                                 <3 1  40>,
>                                 <3 2  20>,
>                                 <3 3  10>;
>           };
>
> the entries like < 0 0 > < 1 1>  < 2 2> < 3 3> can be optional and code can
> put default value(local distance).
> the entries like <1 0> can be optional if <0 1> and <1 0> are of same
> distance.
Does this binding look OK?
I can implement it and submit it in the next version of the patchset.
>
>
>> > >
>> > > Likewise for the mesh networks (e.g. that of TILE-Mx)
>> > >
>> > > > > however, if there are 2 SoC connected thorough the CCN, then it
>> > > > > is very much
>> > > > > similar to cavium topology.
>> > > > >
>> > > > > > Must all of these have the same length? If so, why not have a
>> > > > > > #(whatever)-cells property in the root to describe the
>> > > > > > expected length?
>> > > > > > If not, how are they to be interpreted relative to each
>> > > > > > other?
>> > > > >
>> > > > >
>> > > > > yes, all are of default size.
>> > >
>> > > Where that size is...?
>> > >
>> > > > > IMHO, there is no need to add cells property.
>> > >
>> > > That might be the case, but it's unclear from the documentation. I
>> > > don't
>> > > see how one would parse / verify values currently.
>> > >
>> > > > > > > +the arm,associativity nodes. The first integer is the most
>> > > > > > > significant
>> > > > > > > +NUMA boundary and the following are progressively less
>> > > > > > > significant
>> > > > > > > boundaries.
>> > > > > > > +There can be more than one level of NUMA.
>> > > > > >
>> > > > > > I'm not clear on why this is necessary; the arm,associativity
>> > > > > > property
>> > > > > > is already ordered from most significant to least significant
>> > > > > > per its
>> > > > > > description.
>> > > > >
>> > > > >
>> > > > > first entry in arm,associativity-reference-points is used to
>> > > > > find which
>> > > > > entry in associativity defines node id.
>> > > > > also entries in arm,associativity-reference-points defines,
>> > > > > how many entries(depth) in associativity can be used to
>> > > > > calculate node
>> > > > > distance
>> > > > > in both level 1 and  multi level(hierarchical) numa topology.
>> > >
>> > > I think this needs a more thorough description; I don't follow the
>> > > current one.
>> > >
>> > > > > > Is this only expected at the root of the tree? Can it be re
>> > > > > > -defined in
>> > > > > > sub-nodes?
>> > > > >
>> > > > > yes it is defined only at the root.
>> > >
>> > > This needs to be stated explicitly.
>> > >
>> > > I see that this being the case, *,associativity-reference-points
>> > > would
>> > > be a more powerful property than the #(whatever)-cells property I
>> > > mentioned earlier, but a more thorough description is required.
>> > >
>> > > Thanks,
>> > > Mark.
>> > thanks
>> > Ganapat
>
>
> thanks
> Ganapat
thanks
Ganapat


* Re: [PATCH v5 1/4] arm64, numa: adding numa support for arm64 platforms.
  2015-08-14 16:39   ` Ganapatrao Kulkarni
@ 2015-10-05  5:24       ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-10-05  5:24 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, Will Deacon, Catalin Marinas
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Grant Likely, Leif Lindholm,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA, Ard Biesheuvel,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring, Steve Capper,
	Hanjun Guo, Al Stone, Arnd Bergmann, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, Prasun Kapoor, Robert Richter

Hi Catalin,

Could you please review this patch?

On Fri, Aug 14, 2015 at 10:09 PM, Ganapatrao Kulkarni
<gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org> wrote:
> Adding numa support for arm64 based platforms.
> This patch adds by default the dummy numa node and
> maps all memory and cpus to node 0.
> using this patch, numa can be simulated on single node arm64 platforms.
>
> Signed-off-by: Ganapatrao Kulkarni <gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
> ---
>  arch/arm64/Kconfig              |  26 ++
>  arch/arm64/include/asm/mmzone.h |  32 +++
>  arch/arm64/include/asm/numa.h   |  42 +++
>  arch/arm64/kernel/setup.c       |   9 +
>  arch/arm64/kernel/smp.c         |   2 +
>  arch/arm64/mm/Makefile          |   1 +
>  arch/arm64/mm/init.c            |  34 ++-
>  arch/arm64/mm/numa.c            | 550 ++++++++++++++++++++++++++++++++++++++++
>  8 files changed, 690 insertions(+), 6 deletions(-)
>  create mode 100644 arch/arm64/include/asm/mmzone.h
>  create mode 100644 arch/arm64/include/asm/numa.h
>  create mode 100644 arch/arm64/mm/numa.c
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 43a0c26..fa37a5d 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -72,6 +72,7 @@ config ARM64
>         select HAVE_PERF_USER_STACK_DUMP
>         select HAVE_RCU_TABLE_FREE
>         select HAVE_SYSCALL_TRACEPOINTS
> +       select HAVE_MEMBLOCK_NODE_MAP if NUMA
>         select IRQ_DOMAIN
>         select IRQ_FORCED_THREADING
>         select MODULES_USE_ELF_RELA
> @@ -559,6 +560,31 @@ config HOTPLUG_CPU
>           Say Y here to experiment with turning CPUs off and on.  CPUs
>           can be controlled through /sys/devices/system/cpu.
>
> +# Common NUMA Features
> +config NUMA
> +       bool "Numa Memory Allocation and Scheduler Support"
> +       depends on SMP
> +       help
> +         Enable NUMA (Non Uniform Memory Access) support.
> +
> +         The kernel will try to allocate memory used by a CPU on the
> +         local memory controller of the CPU and add some more
> +         NUMA awareness to the kernel.
> +
> +config NODES_SHIFT
> +       int "Maximum NUMA Nodes (as a power of 2)"
> +       range 1 10
> +       default "2"
> +       depends on NEED_MULTIPLE_NODES
> +       help
> +         Specify the maximum number of NUMA Nodes available on the target
> +         system.  Increases memory reserved to accommodate various tables.
> +
> +config USE_PERCPU_NUMA_NODE_ID
> +       def_bool y
> +       depends on NUMA
> +
> +
>  source kernel/Kconfig.preempt
>
>  config UP_LATE_INIT
> diff --git a/arch/arm64/include/asm/mmzone.h b/arch/arm64/include/asm/mmzone.h
> new file mode 100644
> index 0000000..d27ee66
> --- /dev/null
> +++ b/arch/arm64/include/asm/mmzone.h
> @@ -0,0 +1,32 @@
> +#ifndef __ASM_ARM64_MMZONE_H_
> +#define __ASM_ARM64_MMZONE_H_
> +
> +#ifdef CONFIG_NUMA
> +
> +#include <linux/mmdebug.h>
> +#include <asm/smp.h>
> +#include <linux/types.h>
> +#include <asm/numa.h>
> +
> +extern struct pglist_data *node_data[];
> +
> +#define NODE_DATA(nid)         (node_data[nid])
> +
> +
> +struct numa_memblk {
> +       u64                     start;
> +       u64                     end;
> +       int                     nid;
> +};
> +
> +struct numa_meminfo {
> +       int                     nr_blks;
> +       struct numa_memblk      blk[NR_NODE_MEMBLKS];
> +};
> +
> +void __init numa_remove_memblk_from(int idx, struct numa_meminfo *mi);
> +int __init numa_cleanup_meminfo(struct numa_meminfo *mi);
> +void __init numa_reset_distance(void);
> +
> +#endif /* CONFIG_NUMA */
> +#endif /* __ASM_ARM64_MMZONE_H_ */
> diff --git a/arch/arm64/include/asm/numa.h b/arch/arm64/include/asm/numa.h
> new file mode 100644
> index 0000000..59b834e
> --- /dev/null
> +++ b/arch/arm64/include/asm/numa.h
> @@ -0,0 +1,42 @@
> +#ifndef _ASM_NUMA_H
> +#define _ASM_NUMA_H
> +
> +#include <linux/nodemask.h>
> +#include <asm/topology.h>
> +
> +#ifdef CONFIG_NUMA
> +
> +#define NR_NODE_MEMBLKS                (MAX_NUMNODES * 2)
> +#define ZONE_ALIGN (1UL << (MAX_ORDER + PAGE_SHIFT))
> +
> +/* currently, arm64 implements flat NUMA topology */
> +#define parent_node(node)      (node)
> +
> +/* dummy definitions for pci functions */
> +#define pcibus_to_node(node)   0
> +#define cpumask_of_pcibus(bus) 0
> +
> +struct __node_cpu_hwid {
> +       u32 node_id;    /* logical node containing this CPU */
> +       u64 cpu_hwid;   /* MPIDR for this CPU */
> +};
> +
> +extern struct __node_cpu_hwid node_cpu_hwid[NR_CPUS];
> +extern nodemask_t numa_nodes_parsed __initdata;
> +
> +const struct cpumask *cpumask_of_node(int node);
> +/* Mappings between node number and cpus on that node. */
> +extern cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
> +
> +void __init arm64_numa_init(void);
> +int __init numa_add_memblk(u32 nodeid, u64 start, u64 end);
> +void numa_store_cpu_info(int cpu);
> +void __init build_cpu_to_node_map(void);
> +void __init numa_set_distance(int from, int to, int distance);
> +#else  /* CONFIG_NUMA */
> +static inline void numa_store_cpu_info(int cpu)                { }
> +static inline void arm64_numa_init(void)               { }
> +static inline void build_cpu_to_node_map(void) { }
> +static inline void numa_set_distance(int from, int to, int distance) { }
> +#endif /* CONFIG_NUMA */
> +#endif /* _ASM_NUMA_H */
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 4d4d7ce..6e101eb 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -65,6 +65,7 @@
>  #include <asm/efi.h>
>  #include <asm/virt.h>
>  #include <asm/xen/hypervisor.h>
> +#include <asm/numa.h>
>
>  unsigned long elf_hwcap __read_mostly;
>  EXPORT_SYMBOL_GPL(elf_hwcap);
> @@ -439,6 +440,9 @@ static int __init topology_init(void)
>  {
>         int i;
>
> +       for_each_online_node(i)
> +               register_one_node(i);
> +
>         for_each_possible_cpu(i) {
>                 struct cpu *cpu = &per_cpu(cpu_data.cpu, i);
>                 cpu->hotpluggable = 1;
> @@ -511,7 +515,12 @@ static int c_show(struct seq_file *m, void *v)
>                  * "processor".  Give glibc what it expects.
>                  */
>  #ifdef CONFIG_SMP
> +       if (IS_ENABLED(CONFIG_NUMA)) {
> +               seq_printf(m, "processor\t: %d", i);
> +               seq_printf(m, " [nid: %d]\n", cpu_to_node(i));
> +       } else {
>                 seq_printf(m, "processor\t: %d\n", i);
> +       }
>  #endif
>
>                 /*
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 50fb469..ae3e02c 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -52,6 +52,7 @@
>  #include <asm/sections.h>
>  #include <asm/tlbflush.h>
>  #include <asm/ptrace.h>
> +#include <asm/numa.h>
>
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/ipi.h>
> @@ -124,6 +125,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>  static void smp_store_cpu_info(unsigned int cpuid)
>  {
>         store_cpu_topology(cpuid);
> +       numa_store_cpu_info(cpuid);
>  }
>
>  /*
> diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
> index 773d37a..bb92d41 100644
> --- a/arch/arm64/mm/Makefile
> +++ b/arch/arm64/mm/Makefile
> @@ -4,3 +4,4 @@ obj-y                           := dma-mapping.o extable.o fault.o init.o \
>                                    context.o proc.o pageattr.o
>  obj-$(CONFIG_HUGETLB_PAGE)     += hugetlbpage.o
>  obj-$(CONFIG_ARM64_PTDUMP)     += dump.o
> +obj-$(CONFIG_NUMA)             += numa.o
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 54da32e..cab384b 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -42,6 +42,7 @@
>  #include <asm/sizes.h>
>  #include <asm/tlb.h>
>  #include <asm/alternative.h>
> +#include <asm/numa.h>
>
>  #include "mm.h"
>
> @@ -77,6 +78,20 @@ static phys_addr_t max_zone_dma_phys(void)
>         return min(offset + (1ULL << 32), memblock_end_of_DRAM());
>  }
>
> +#ifdef CONFIG_NUMA
> +static void __init zone_sizes_init(unsigned long min, unsigned long max)
> +{
> +       unsigned long max_zone_pfns[MAX_NR_ZONES];
> +
> +       memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
> +       if (IS_ENABLED(CONFIG_ZONE_DMA))
> +               max_zone_pfns[ZONE_DMA] = PFN_DOWN(max_zone_dma_phys());
> +       max_zone_pfns[ZONE_NORMAL] = max;
> +
> +       free_area_init_nodes(max_zone_pfns);
> +}
> +
> +#else
>  static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  {
>         struct memblock_region *reg;
> @@ -115,6 +130,7 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>
>         free_area_init_node(0, zone_size, min, zhole_size);
>  }
> +#endif /* CONFIG_NUMA */
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>  int pfn_valid(unsigned long pfn)
> @@ -132,10 +148,15 @@ static void arm64_memory_present(void)
>  static void arm64_memory_present(void)
>  {
>         struct memblock_region *reg;
> +       int nid = 0;
>
> -       for_each_memblock(memory, reg)
> -               memory_present(0, memblock_region_memory_base_pfn(reg),
> -                              memblock_region_memory_end_pfn(reg));
> +       for_each_memblock(memory, reg) {
> +#ifdef CONFIG_NUMA
> +               nid = reg->nid;
> +#endif
> +               memory_present(nid, memblock_region_memory_base_pfn(reg),
> +                               memblock_region_memory_end_pfn(reg));
> +       }
>  }
>  #endif
>
> @@ -200,6 +221,10 @@ void __init bootmem_init(void)
>         min = PFN_UP(memblock_start_of_DRAM());
>         max = PFN_DOWN(memblock_end_of_DRAM());
>
> +       high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
> +       max_pfn = max_low_pfn = max;
> +
> +       arm64_numa_init();
>         /*
>          * Sparsemem tries to allocate bootmem in memory_present(), so must be
>          * done after the fixed reservations.
> @@ -208,9 +233,6 @@ void __init bootmem_init(void)
>
>         sparse_init();
>         zone_sizes_init(min, max);
> -
> -       high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
> -       max_pfn = max_low_pfn = max;
>  }
>
>  #ifndef CONFIG_SPARSEMEM_VMEMMAP
> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
> new file mode 100644
> index 0000000..2be83de
> --- /dev/null
> +++ b/arch/arm64/mm/numa.c
> @@ -0,0 +1,550 @@
> +/*
> + * NUMA support, based on the x86 implementation.
> + *
> + * Copyright (C) 2015 Cavium Inc.
> + * Author: Ganapatrao Kulkarni <gkulkarni-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/mm.h>
> +#include <linux/string.h>
> +#include <linux/init.h>
> +#include <linux/bootmem.h>
> +#include <linux/memblock.h>
> +#include <linux/ctype.h>
> +#include <linux/module.h>
> +#include <linux/nodemask.h>
> +#include <linux/sched.h>
> +#include <linux/topology.h>
> +#include <linux/mmzone.h>
> +
> +#include <asm/smp_plat.h>
> +
> +int __initdata numa_off;
> +nodemask_t numa_nodes_parsed __initdata;
> +static int numa_distance_cnt;
> +static u8 *numa_distance;
> +static u8 dummy_numa_enabled;
> +
> +struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
> +EXPORT_SYMBOL(node_data);
> +
> +struct __node_cpu_hwid node_cpu_hwid[NR_CPUS];
> +static struct numa_meminfo numa_meminfo;
> +
> +static __init int numa_setup(char *opt)
> +{
> +       if (!opt)
> +               return -EINVAL;
> +       if (!strncmp(opt, "off", 3)) {
> +               pr_info("%s\n", "NUMA turned off");
> +               numa_off = 1;
> +       }
> +       return 0;
> +}
> +early_param("numa", numa_setup);
> +
> +cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
> +EXPORT_SYMBOL(node_to_cpumask_map);
> +
> +int cpu_to_node_map[NR_CPUS];
> +EXPORT_SYMBOL(cpu_to_node_map);
> +
> +/*
> + * Returns a pointer to the bitmask of CPUs on Node 'node'.
> + */
> +const struct cpumask *cpumask_of_node(int node)
> +{
> +       if (node >= nr_node_ids) {
> +               pr_warn("cpumask_of_node(%d): node > nr_node_ids(%d)\n",
> +                       node, nr_node_ids);
> +               dump_stack();
> +               return cpu_none_mask;
> +       }
> +       if (node_to_cpumask_map[node] == NULL) {
> +               pr_warn("cpumask_of_node(%d): no node_to_cpumask_map!\n",
> +                       node);
> +               dump_stack();
> +               return cpu_online_mask;
> +       }
> +       return node_to_cpumask_map[node];
> +}
> +EXPORT_SYMBOL(cpumask_of_node);
> +
> +void numa_clear_node(int cpu)
> +{
> +       set_cpu_numa_node(cpu, NUMA_NO_NODE);
> +}
> +
> +void map_cpu_to_node(int cpu, int nid)
> +{
> +       if (nid < 0) { /* just initialize by zero */
> +               cpu_to_node_map[cpu] = 0;
> +               return;
> +       }
> +
> +       cpu_to_node_map[cpu] = nid;
> +       cpumask_set_cpu(cpu, node_to_cpumask_map[nid]);
> +       set_numa_node(nid);
> +}
> +
> +/**
> + * build_cpu_to_node_map - setup cpu to node and node to cpumask arrays
> + *
> + * Build cpu to node mapping and initialize the per node cpu masks using
> + * info from the node_cpu_hwid array handed to us by ACPI or DT.
> + */
> +void __init build_cpu_to_node_map(void)
> +{
> +       int cpu, i, node;
> +
> +       for (node = 0; node < MAX_NUMNODES; node++)
> +               cpumask_clear(node_to_cpumask_map[node]);
> +
> +       for_each_possible_cpu(cpu) {
> +               node = NUMA_NO_NODE;
> +               for_each_possible_cpu(i) {
> +                       if (cpu_logical_map(cpu) == node_cpu_hwid[i].cpu_hwid) {
> +                               node = node_cpu_hwid[i].node_id;
> +                               break;
> +                       }
> +               }
> +               map_cpu_to_node(cpu, node);
> +       }
> +}
> +/*
> + * Allocate node_to_cpumask_map based on number of available nodes
> + * Requires node_possible_map to be valid.
> + *
> + * Note: cpumask_of_node() is not valid until after this is done.
> + * (Use CONFIG_DEBUG_PER_CPU_MAPS to check this.)
> + */
> +void __init setup_node_to_cpumask_map(void)
> +{
> +       unsigned int node;
> +
> +       /* setup nr_node_ids if not done yet */
> +       if (nr_node_ids == MAX_NUMNODES)
> +               setup_nr_node_ids();
> +
> +       /* allocate the map */
> +       for (node = 0; node < nr_node_ids; node++)
> +               alloc_bootmem_cpumask_var(&node_to_cpumask_map[node]);
> +
> +       /* cpumask_of_node() will now work */
> +       pr_debug("Node to cpumask map for %d nodes\n", nr_node_ids);
> +}
> +
> +/*
> + *  Set the cpu to node and mem mapping
> + */
> +void numa_store_cpu_info(int cpu)
> +{
> +       if (dummy_numa_enabled) {
> +               /* set to default */
> +               node_cpu_hwid[cpu].node_id  =  0;
> +               node_cpu_hwid[cpu].cpu_hwid = cpu_logical_map(cpu);
> +       }
> +       map_cpu_to_node(cpu, node_cpu_hwid[cpu].node_id);
> +}
> +
> +/**
> + * numa_add_memblk_to - Add one numa_memblk to a numa_meminfo
> + */
> +
> +static int __init numa_add_memblk_to(int nid, u64 start, u64 end,
> +                                    struct numa_meminfo *mi)
> +{
> +       /* ignore zero length blks */
> +       if (start == end)
> +               return 0;
> +
> +       /* whine about and ignore invalid blks */
> +       if (start > end || nid < 0 || nid >= MAX_NUMNODES) {
> +               pr_warn("NUMA: Warning: invalid memblk node %d [mem %#010Lx-%#010Lx]\n",
> +                               nid, start, end - 1);
> +               return 0;
> +       }
> +
> +       if (mi->nr_blks >= NR_NODE_MEMBLKS) {
> +               pr_err("NUMA: too many memblk ranges\n");
> +               return -EINVAL;
> +       }
> +
> +       pr_info("NUMA: Adding memblock %d [0x%llx - 0x%llx] on node %d\n",
> +                       mi->nr_blks, start, end, nid);
> +       mi->blk[mi->nr_blks].start = start;
> +       mi->blk[mi->nr_blks].end = end;
> +       mi->blk[mi->nr_blks].nid = nid;
> +       mi->nr_blks++;
> +       return 0;
> +}
> +
> +/**
> + * numa_add_memblk - Add one numa_memblk to numa_meminfo
> + * @nid: NUMA node ID of the new memblk
> + * @start: Start address of the new memblk
> + * @end: End address of the new memblk
> + *
> + * Add a new memblk to the default numa_meminfo.
> + *
> + * RETURNS:
> + * 0 on success, -errno on failure.
> + */
> +#define MAX_PHYS_ADDR  ((phys_addr_t)~0)
> +
> +int __init numa_add_memblk(u32 nid, u64 base, u64 end)
> +{
> +       const u64 phys_offset = __pa(PAGE_OFFSET);
> +
> +       base &= PAGE_MASK;
> +       end &= PAGE_MASK;
> +
> +       if (base > MAX_PHYS_ADDR) {
> +               pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
> +                               base, base + end);
> +               return -ENOMEM;
> +       }
> +
> +       if (base + end > MAX_PHYS_ADDR) {
> +               pr_info("NUMA: Ignoring memory range 0x%lx - 0x%llx\n",
> +                               ULONG_MAX, base + end);
> +               end = MAX_PHYS_ADDR - base;
> +       }
> +
> +       if (base + end < phys_offset) {
> +               pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
> +                          base, base + end);
> +               return -ENOMEM;
> +       }
> +       if (base < phys_offset) {
> +               pr_info("NUMA: Ignoring memory range 0x%llx - 0x%llx\n",
> +                          base, phys_offset);
> +               end -= phys_offset - base;
> +               base = phys_offset;
> +       }
> +
> +       return numa_add_memblk_to(nid, base, base + end, &numa_meminfo);
> +}
> +EXPORT_SYMBOL(numa_add_memblk);
> +
> +/* Initialize NODE_DATA for a node on the local memory */
> +static void __init setup_node_data(int nid, u64 start, u64 end)
> +{
> +       const size_t nd_size = roundup(sizeof(pg_data_t), PAGE_SIZE);
> +       u64 nd_pa;
> +       void *nd;
> +       int tnid;
> +
> +       start = roundup(start, ZONE_ALIGN);
> +
> +       pr_info("Initmem setup node %d [mem %#010Lx-%#010Lx]\n",
> +              nid, start, end - 1);
> +
> +       /*
> +        * Allocate node data.  Try node-local memory and then any node.
> +        */
> +       nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
> +       if (!nd_pa) {
> +               nd_pa = __memblock_alloc_base(nd_size, SMP_CACHE_BYTES,
> +                                             MEMBLOCK_ALLOC_ACCESSIBLE);
> +               if (!nd_pa) {
> +                       pr_err("Cannot find %zu bytes in node %d\n",
> +                              nd_size, nid);
> +                       return;
> +               }
> +       }
> +       nd = __va(nd_pa);
> +
> +       /* report and initialize */
> +       pr_info("  NODE_DATA [mem %#010Lx-%#010Lx]\n",
> +              nd_pa, nd_pa + nd_size - 1);
> +       tnid = early_pfn_to_nid(nd_pa >> PAGE_SHIFT);
> +       if (tnid != nid)
> +               pr_info("    NODE_DATA(%d) on node %d\n", nid, tnid);
> +
> +       node_data[nid] = nd;
> +       memset(NODE_DATA(nid), 0, sizeof(pg_data_t));
> +       NODE_DATA(nid)->node_id = nid;
> +       NODE_DATA(nid)->node_start_pfn = start >> PAGE_SHIFT;
> +       NODE_DATA(nid)->node_spanned_pages = (end - start) >> PAGE_SHIFT;
> +
> +       node_set_online(nid);
> +}
> +
> +/*
> + * Set nodes, which have memory in @mi, in *@nodemask.
> + */
> +static void __init numa_nodemask_from_meminfo(nodemask_t *nodemask,
> +                                             const struct numa_meminfo *mi)
> +{
> +       int i;
> +
> +       for (i = 0; i < ARRAY_SIZE(mi->blk); i++)
> +               if (mi->blk[i].start != mi->blk[i].end &&
> +                   mi->blk[i].nid != NUMA_NO_NODE)
> +                       node_set(mi->blk[i].nid, *nodemask);
> +}
> +
> +/*
> + * Sanity check to catch more bad NUMA configurations (they are amazingly
> + * common).  Make sure the nodes cover all memory.
> + */
> +static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
> +{
> +       u64 numaram, totalram;
> +       int i;
> +
> +       numaram = 0;
> +       for (i = 0; i < mi->nr_blks; i++) {
> +               u64 s = mi->blk[i].start >> PAGE_SHIFT;
> +               u64 e = mi->blk[i].end >> PAGE_SHIFT;
> +
> +               numaram += e - s;
> +               numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
> +               if ((s64)numaram < 0)
> +                       numaram = 0;
> +       }
> +
> +       totalram = max_pfn - absent_pages_in_range(0, max_pfn);
> +
> +       /* We seem to lose 3 pages somewhere. Allow 1M of slack. */
> +       if ((s64)(totalram - numaram) >= (1 << (20 - PAGE_SHIFT))) {
> +               pr_err("NUMA: nodes only cover %lluMB of your %lluMB Total RAM. Not used.\n",
> +                      (numaram << PAGE_SHIFT) >> 20,
> +                      (totalram << PAGE_SHIFT) >> 20);
> +               return false;
> +       }
> +       return true;
> +}
> +
> +/**
> + * numa_reset_distance - Reset NUMA distance table
> + *
> + * The current table is freed.  The next numa_set_distance() call will
> + * create a new one.
> + */
> +void __init numa_reset_distance(void)
> +{
> +       size_t size = numa_distance_cnt * numa_distance_cnt *
> +               sizeof(numa_distance[0]);
> +
> +       /* numa_distance could be 1LU marking allocation failure, test cnt */
> +       if (numa_distance_cnt)
> +               memblock_free(__pa(numa_distance), size);
> +       numa_distance_cnt = 0;
> +       numa_distance = NULL;   /* enable table creation */
> +}
> +
> +static int __init numa_alloc_distance(void)
> +{
> +       nodemask_t nodes_parsed;
> +       size_t size;
> +       int i, j, cnt = 0;
> +       u64 phys;
> +
> +       /* size the new table and allocate it */
> +       nodes_parsed = numa_nodes_parsed;
> +       numa_nodemask_from_meminfo(&nodes_parsed, &numa_meminfo);
> +
> +       for_each_node_mask(i, nodes_parsed)
> +               cnt = i;
> +       cnt++;
> +       size = cnt * cnt * sizeof(numa_distance[0]);
> +
> +       phys = memblock_find_in_range(0, PFN_PHYS(max_pfn),
> +                                     size, PAGE_SIZE);
> +       if (!phys) {
> +               pr_warn("NUMA: Warning: can't allocate distance table!\n");
> +               /* don't retry until explicitly reset */
> +               numa_distance = (void *)1LU;
> +               return -ENOMEM;
> +       }
> +       memblock_reserve(phys, size);
> +
> +       numa_distance = __va(phys);
> +       numa_distance_cnt = cnt;
> +
> +       /* fill with the default distances */
> +       for (i = 0; i < cnt; i++)
> +               for (j = 0; j < cnt; j++)
> +                       numa_distance[i * cnt + j] = i == j ?
> +                               LOCAL_DISTANCE : REMOTE_DISTANCE;
> +       pr_debug("NUMA: Initialized distance table, cnt=%d\n", cnt);
> +
> +       return 0;
> +}
> +
> +/**
> + * numa_set_distance - Set NUMA distance from one NUMA to another
> + * @from: the 'from' node to set distance
> + * @to: the 'to'  node to set distance
> + * @distance: NUMA distance
> + *
> + * Set the distance from node @from to @to to @distance.  If distance table
> + * doesn't exist, one which is large enough to accommodate all the currently
> + * known nodes will be created.
> + *
> + * If such table cannot be allocated, a warning is printed and further
> + * calls are ignored until the distance table is reset with
> + * numa_reset_distance().
> + *
> + * If @from or @to is higher than the highest known node or lower than zero
> + * at the time of table creation or @distance doesn't make sense, the call
> + * is ignored.
> + * This is to allow simplification of specific NUMA config implementations.
> + */
> +void __init numa_set_distance(int from, int to, int distance)
> +{
> +       if (!numa_distance && numa_alloc_distance() < 0)
> +               return;
> +
> +       if (from >= numa_distance_cnt || to >= numa_distance_cnt ||
> +                       from < 0 || to < 0) {
> +               pr_warn_once("NUMA: Warning: node ids are out of bound, from=%d to=%d distance=%d\n",
> +                           from, to, distance);
> +               return;
> +       }
> +
> +       if ((u8)distance != distance ||
> +           (from == to && distance != LOCAL_DISTANCE)) {
> +               pr_warn_once("NUMA: Warning: invalid distance parameter, from=%d to=%d distance=%d\n",
> +                            from, to, distance);
> +               return;
> +       }
> +
> +       numa_distance[from * numa_distance_cnt + to] = distance;
> +}
> +EXPORT_SYMBOL(numa_set_distance);
> +
> +int __node_distance(int from, int to)
> +{
> +       if (from >= numa_distance_cnt || to >= numa_distance_cnt)
> +               return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
> +       return numa_distance[from * numa_distance_cnt + to];
> +}
> +EXPORT_SYMBOL(__node_distance);
> +
> +static int __init numa_register_memblks(struct numa_meminfo *mi)
> +{
> +       unsigned long uninitialized_var(pfn_align);
> +       int i, nid;
> +
> +       /* Account for nodes with cpus and no memory */
> +       node_possible_map = numa_nodes_parsed;
> +       numa_nodemask_from_meminfo(&node_possible_map, mi);
> +       if (WARN_ON(nodes_empty(node_possible_map)))
> +               return -EINVAL;
> +
> +       for (i = 0; i < mi->nr_blks; i++) {
> +               struct numa_memblk *mb = &mi->blk[i];
> +
> +               memblock_set_node(mb->start, mb->end - mb->start,
> +                                 &memblock.memory, mb->nid);
> +       }
> +
> +       /*
> +        * If sections array is gonna be used for pfn -> nid mapping, check
> +        * whether its granularity is fine enough.
> +        */
> +#ifdef NODE_NOT_IN_PAGE_FLAGS
> +       pfn_align = node_map_pfn_alignment();
> +       if (pfn_align && pfn_align < PAGES_PER_SECTION) {
> +               pr_warn("Node alignment %lluMB < min %lluMB, rejecting NUMA config\n",
> +                      PFN_PHYS(pfn_align) >> 20,
> +                      PFN_PHYS(PAGES_PER_SECTION) >> 20);
> +               return -EINVAL;
> +       }
> +#endif
> +       if (!numa_meminfo_cover_memory(mi))
> +               return -EINVAL;
> +
> +       /* Finally register nodes. */
> +       for_each_node_mask(nid, node_possible_map) {
> +               u64 start = PFN_PHYS(max_pfn);
> +               u64 end = 0;
> +
> +               for (i = 0; i < mi->nr_blks; i++) {
> +                       if (nid != mi->blk[i].nid)
> +                               continue;
> +                       start = min(mi->blk[i].start, start);
> +                       end = max(mi->blk[i].end, end);
> +               }
> +
> +               if (start < end)
> +                       setup_node_data(nid, start, end);
> +       }
> +
> +       /* Dump memblock with node info and return. */
> +       memblock_dump_all();
> +       return 0;
> +}
> +
> +static int __init numa_init(int (*init_func)(void))
> +{
> +       int ret, i;
> +
> +       nodes_clear(numa_nodes_parsed);
> +       nodes_clear(node_possible_map);
> +       nodes_clear(node_online_map);
> +
> +       ret = init_func();
> +       if (ret < 0)
> +               return ret;
> +
> +       ret = numa_register_memblks(&numa_meminfo);
> +       if (ret < 0)
> +               return ret;
> +
> +       for (i = 0; i < nr_cpu_ids; i++)
> +               numa_clear_node(i);
> +
> +       setup_node_to_cpumask_map();
> +       build_cpu_to_node_map();
> +       return 0;
> +}
> +
> +/**
> + * dummy_numa_init - Fallback dummy NUMA init
> + *
> + * Used if there's no underlying NUMA architecture, NUMA initialization
> + * fails, or NUMA is disabled on the command line.
> + *
> + * Must online at least one node and add memory blocks that cover all
> + * allowed memory.  This function must not fail.
> + */
> +static int __init dummy_numa_init(void)
> +{
> +       pr_info("%s\n", "No NUMA configuration found");
> +       pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n",
> +              0LLU, PFN_PHYS(max_pfn) - 1);
> +       node_set(0, numa_nodes_parsed);
> +       numa_add_memblk(0, 0, PFN_PHYS(max_pfn));
> +       dummy_numa_enabled = 1;
> +
> +       return 0;
> +}
> +
> +/**
> + * arm64_numa_init - Initialize NUMA
> + *
> + * Try each configured NUMA initialization method until one succeeds.  The
> + * last fallback is dummy single node config encompassing whole memory and
> + * never fails.
> + */
> +void __init arm64_numa_init(void)
> +{
> +       numa_init(dummy_numa_init);
> +}
> --
> 1.8.1.4
>

thanks
Ganapat
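
For illustration only (this is not part of the posted patch): the final
loop in numa_register_memblks() derives each node's span by taking the
minimum start and the maximum end over the blocks assigned to that node.
A minimal userspace sketch of that step follows. The names struct memblk
and node_span() are hypothetical stand-ins for the kernel's
struct numa_memblk and the loop body, and limit plays the role of
PFN_PHYS(max_pfn).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the patch's struct numa_memblk. */
struct memblk {
	uint64_t start;	/* first byte of the block */
	uint64_t end;	/* one past the last byte of the block */
	int nid;	/* node that owns the block */
};

/*
 * Compute the span [*start, *end) that covers every block owned by nid.
 * limit is the initial (largest possible) start value.
 * Returns 1 if the node has any memory, 0 otherwise.
 */
static int node_span(const struct memblk *blk, size_t nr, int nid,
		     uint64_t limit, uint64_t *start, uint64_t *end)
{
	uint64_t s = limit, e = 0;
	size_t i;

	assert(blk != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		if (blk[i].nid != nid)
			continue;
		if (blk[i].start < s)
			s = blk[i].start;
		if (blk[i].end > e)
			e = blk[i].end;
	}

	*start = s;
	*end = e;
	return s < e;
}
```

The span includes any holes, or blocks of other nodes, that lie between
the lowest and highest block of the node. The sketch, like the loop it
mirrors, does not check for that.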

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 1/4] arm64, numa: adding numa support for arm64 platforms.
@ 2015-10-05  5:24       ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-10-05  5:24 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Catalin,

Could you please review this patch?

On Fri, Aug 14, 2015 at 10:09 PM, Ganapatrao Kulkarni
<gkulkarni@caviumnetworks.com> wrote:
> Add NUMA support for arm64-based platforms.
> By default, this patch adds a dummy NUMA node and
> maps all memory and CPUs to node 0.
> With this patch, NUMA can be simulated on single-node arm64 platforms.
>
> Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
> ---
>  arch/arm64/Kconfig              |  26 ++
>  arch/arm64/include/asm/mmzone.h |  32 +++
>  arch/arm64/include/asm/numa.h   |  42 +++
>  arch/arm64/kernel/setup.c       |   9 +
>  arch/arm64/kernel/smp.c         |   2 +
>  arch/arm64/mm/Makefile          |   1 +
>  arch/arm64/mm/init.c            |  34 ++-
>  arch/arm64/mm/numa.c            | 550 ++++++++++++++++++++++++++++++++++++++++
>  8 files changed, 690 insertions(+), 6 deletions(-)
>  create mode 100644 arch/arm64/include/asm/mmzone.h
>  create mode 100644 arch/arm64/include/asm/numa.h
>  create mode 100644 arch/arm64/mm/numa.c
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 43a0c26..fa37a5d 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -72,6 +72,7 @@ config ARM64
>         select HAVE_PERF_USER_STACK_DUMP
>         select HAVE_RCU_TABLE_FREE
>         select HAVE_SYSCALL_TRACEPOINTS
> +       select HAVE_MEMBLOCK_NODE_MAP if NUMA
>         select IRQ_DOMAIN
>         select IRQ_FORCED_THREADING
>         select MODULES_USE_ELF_RELA
> @@ -559,6 +560,31 @@ config HOTPLUG_CPU
>           Say Y here to experiment with turning CPUs off and on.  CPUs
>           can be controlled through /sys/devices/system/cpu.
>
> +# Common NUMA Features
> +config NUMA
> +       bool "Numa Memory Allocation and Scheduler Support"
> +       depends on SMP
> +       help
> +         Enable NUMA (Non Uniform Memory Access) support.
> +
> +         The kernel will try to allocate memory used by a CPU on the
> +         local memory controller of the CPU and add some more
> +         NUMA awareness to the kernel.
> +
> +config NODES_SHIFT
> +       int "Maximum NUMA Nodes (as a power of 2)"
> +       range 1 10
> +       default "2"
> +       depends on NEED_MULTIPLE_NODES
> +       help
> +         Specify the maximum number of NUMA Nodes available on the target
> +         system.  Increases memory reserved to accommodate various tables.
> +
> +config USE_PERCPU_NUMA_NODE_ID
> +       def_bool y
> +       depends on NUMA
> +
> +
>  source kernel/Kconfig.preempt
>
>  config UP_LATE_INIT
> diff --git a/arch/arm64/include/asm/mmzone.h b/arch/arm64/include/asm/mmzone.h
> new file mode 100644
> index 0000000..d27ee66
> --- /dev/null
> +++ b/arch/arm64/include/asm/mmzone.h
> @@ -0,0 +1,32 @@
> +#ifndef __ASM_ARM64_MMZONE_H_
> +#define __ASM_ARM64_MMZONE_H_
> +
> +#ifdef CONFIG_NUMA
> +
> +#include <linux/mmdebug.h>
> +#include <asm/smp.h>
> +#include <linux/types.h>
> +#include <asm/numa.h>
> +
> +extern struct pglist_data *node_data[];
> +
> +#define NODE_DATA(nid)         (node_data[nid])
> +
> +
> +struct numa_memblk {
> +       u64                     start;
> +       u64                     end;
> +       int                     nid;
> +};
> +
> +struct numa_meminfo {
> +       int                     nr_blks;
> +       struct numa_memblk      blk[NR_NODE_MEMBLKS];
> +};
> +
> +void __init numa_remove_memblk_from(int idx, struct numa_meminfo *mi);
> +int __init numa_cleanup_meminfo(struct numa_meminfo *mi);
> +void __init numa_reset_distance(void);
> +
> +#endif /* CONFIG_NUMA */
> +#endif /* __ASM_ARM64_MMZONE_H_ */
> diff --git a/arch/arm64/include/asm/numa.h b/arch/arm64/include/asm/numa.h
> new file mode 100644
> index 0000000..59b834e
> --- /dev/null
> +++ b/arch/arm64/include/asm/numa.h
> @@ -0,0 +1,42 @@
> +#ifndef _ASM_NUMA_H
> +#define _ASM_NUMA_H
> +
> +#include <linux/nodemask.h>
> +#include <asm/topology.h>
> +
> +#ifdef CONFIG_NUMA
> +
> +#define NR_NODE_MEMBLKS                (MAX_NUMNODES * 2)
> +#define ZONE_ALIGN (1UL << (MAX_ORDER + PAGE_SHIFT))
> +
> +/* currently, arm64 implements flat NUMA topology */
> +#define parent_node(node)      (node)
> +
> +/* dummy definitions for pci functions */
> +#define pcibus_to_node(node)   0
> +#define cpumask_of_pcibus(bus) 0
> +
> +struct __node_cpu_hwid {
> +       u32 node_id;    /* logical node containing this CPU */
> +       u64 cpu_hwid;   /* MPIDR for this CPU */
> +};
> +
> +extern struct __node_cpu_hwid node_cpu_hwid[NR_CPUS];
> +extern nodemask_t numa_nodes_parsed __initdata;
> +
> +const struct cpumask *cpumask_of_node(int node);
> +/* Mappings between node number and cpus on that node. */
> +extern cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
> +
> +void __init arm64_numa_init(void);
> +int __init numa_add_memblk(u32 nodeid, u64 start, u64 end);
> +void numa_store_cpu_info(int cpu);
> +void __init build_cpu_to_node_map(void);
> +void __init numa_set_distance(int from, int to, int distance);
> +#else  /* CONFIG_NUMA */
> +static inline void numa_store_cpu_info(int cpu)                { }
> +static inline void arm64_numa_init(void)               { }
> +static inline void build_cpu_to_node_map(void) { }
> +static inline void numa_set_distance(int from, int to, int distance) { }
> +#endif /* CONFIG_NUMA */
> +#endif /* _ASM_NUMA_H */
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 4d4d7ce..6e101eb 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -65,6 +65,7 @@
>  #include <asm/efi.h>
>  #include <asm/virt.h>
>  #include <asm/xen/hypervisor.h>
> +#include <asm/numa.h>
>
>  unsigned long elf_hwcap __read_mostly;
>  EXPORT_SYMBOL_GPL(elf_hwcap);
> @@ -439,6 +440,9 @@ static int __init topology_init(void)
>  {
>         int i;
>
> +       for_each_online_node(i)
> +               register_one_node(i);
> +
>         for_each_possible_cpu(i) {
>                 struct cpu *cpu = &per_cpu(cpu_data.cpu, i);
>                 cpu->hotpluggable = 1;
> @@ -511,7 +515,12 @@ static int c_show(struct seq_file *m, void *v)
>                  * "processor".  Give glibc what it expects.
>                  */
>  #ifdef CONFIG_SMP
> +       if (IS_ENABLED(CONFIG_NUMA)) {
> +               seq_printf(m, "processor\t: %d", i);
> +               seq_printf(m, " [nid: %d]\n", cpu_to_node(i));
> +       } else {
>                 seq_printf(m, "processor\t: %d\n", i);
> +       }
>  #endif
>
>                 /*
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 50fb469..ae3e02c 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -52,6 +52,7 @@
>  #include <asm/sections.h>
>  #include <asm/tlbflush.h>
>  #include <asm/ptrace.h>
> +#include <asm/numa.h>
>
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/ipi.h>
> @@ -124,6 +125,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>  static void smp_store_cpu_info(unsigned int cpuid)
>  {
>         store_cpu_topology(cpuid);
> +       numa_store_cpu_info(cpuid);
>  }
>
>  /*
> diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
> index 773d37a..bb92d41 100644
> --- a/arch/arm64/mm/Makefile
> +++ b/arch/arm64/mm/Makefile
> @@ -4,3 +4,4 @@ obj-y                           := dma-mapping.o extable.o fault.o init.o \
>                                    context.o proc.o pageattr.o
>  obj-$(CONFIG_HUGETLB_PAGE)     += hugetlbpage.o
>  obj-$(CONFIG_ARM64_PTDUMP)     += dump.o
> +obj-$(CONFIG_NUMA)             += numa.o
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 54da32e..cab384b 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -42,6 +42,7 @@
>  #include <asm/sizes.h>
>  #include <asm/tlb.h>
>  #include <asm/alternative.h>
> +#include <asm/numa.h>
>
>  #include "mm.h"
>
> @@ -77,6 +78,20 @@ static phys_addr_t max_zone_dma_phys(void)
>         return min(offset + (1ULL << 32), memblock_end_of_DRAM());
>  }
>
> +#ifdef CONFIG_NUMA
> +static void __init zone_sizes_init(unsigned long min, unsigned long max)
> +{
> +       unsigned long max_zone_pfns[MAX_NR_ZONES];
> +
> +       memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
> +       if (IS_ENABLED(CONFIG_ZONE_DMA))
> +               max_zone_pfns[ZONE_DMA] = PFN_DOWN(max_zone_dma_phys());
> +       max_zone_pfns[ZONE_NORMAL] = max;
> +
> +       free_area_init_nodes(max_zone_pfns);
> +}
> +
> +#else
>  static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  {
>         struct memblock_region *reg;
> @@ -115,6 +130,7 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>
>         free_area_init_node(0, zone_size, min, zhole_size);
>  }
> +#endif /* CONFIG_NUMA */
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>  int pfn_valid(unsigned long pfn)
> @@ -132,10 +148,15 @@ static void arm64_memory_present(void)
>  static void arm64_memory_present(void)
>  {
>         struct memblock_region *reg;
> +       int nid = 0;
>
> -       for_each_memblock(memory, reg)
> -               memory_present(0, memblock_region_memory_base_pfn(reg),
> -                              memblock_region_memory_end_pfn(reg));
> +       for_each_memblock(memory, reg) {
> +#ifdef CONFIG_NUMA
> +               nid = reg->nid;
> +#endif
> +               memory_present(nid, memblock_region_memory_base_pfn(reg),
> +                               memblock_region_memory_end_pfn(reg));
> +       }
>  }
>  #endif
>
> @@ -200,6 +221,10 @@ void __init bootmem_init(void)
>         min = PFN_UP(memblock_start_of_DRAM());
>         max = PFN_DOWN(memblock_end_of_DRAM());
>
> +       high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
> +       max_pfn = max_low_pfn = max;
> +
> +       arm64_numa_init();
>         /*
>          * Sparsemem tries to allocate bootmem in memory_present(), so must be
>          * done after the fixed reservations.
> @@ -208,9 +233,6 @@ void __init bootmem_init(void)
>
>         sparse_init();
>         zone_sizes_init(min, max);
> -
> -       high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
> -       max_pfn = max_low_pfn = max;
>  }
>
>  #ifndef CONFIG_SPARSEMEM_VMEMMAP
> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
> new file mode 100644
> index 0000000..2be83de
> --- /dev/null
> +++ b/arch/arm64/mm/numa.c
> @@ -0,0 +1,550 @@
> +/*
> + * NUMA support, based on the x86 implementation.
> + *
> + * Copyright (C) 2015 Cavium Inc.
> + * Author: Ganapatrao Kulkarni <gkulkarni@cavium.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/mm.h>
> +#include <linux/string.h>
> +#include <linux/init.h>
> +#include <linux/bootmem.h>
> +#include <linux/memblock.h>
> +#include <linux/ctype.h>
> +#include <linux/module.h>
> +#include <linux/nodemask.h>
> +#include <linux/sched.h>
> +#include <linux/topology.h>
> +#include <linux/mmzone.h>
> +
> +#include <asm/smp_plat.h>
> +
> +int __initdata numa_off;
> +nodemask_t numa_nodes_parsed __initdata;
> +static int numa_distance_cnt;
> +static u8 *numa_distance;
> +static u8 dummy_numa_enabled;
> +
> +struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
> +EXPORT_SYMBOL(node_data);
> +
> +struct __node_cpu_hwid node_cpu_hwid[NR_CPUS];
> +static struct numa_meminfo numa_meminfo;
> +
> +static __init int numa_setup(char *opt)
> +{
> +       if (!opt)
> +               return -EINVAL;
> +       if (!strncmp(opt, "off", 3)) {
> +               pr_info("%s\n", "NUMA turned off");
> +               numa_off = 1;
> +       }
> +       return 0;
> +}
> +early_param("numa", numa_setup);
> +
> +cpumask_var_t node_to_cpumask_map[MAX_NUMNODES];
> +EXPORT_SYMBOL(node_to_cpumask_map);
> +
> +int cpu_to_node_map[NR_CPUS];
> +EXPORT_SYMBOL(cpu_to_node_map);
> +
> +/*
> + * Returns a pointer to the bitmask of CPUs on Node 'node'.
> + */
> +const struct cpumask *cpumask_of_node(int node)
> +{
> +       if (node >= nr_node_ids) {
> +               pr_warn("cpumask_of_node(%d): node >= nr_node_ids(%d)\n",
> +                       node, nr_node_ids);
> +               dump_stack();
> +               return cpu_none_mask;
> +       }
> +       if (node_to_cpumask_map[node] == NULL) {
> +               pr_warn("cpumask_of_node(%d): no node_to_cpumask_map!\n",
> +                       node);
> +               dump_stack();
> +               return cpu_online_mask;
> +       }
> +       return node_to_cpumask_map[node];
> +}
> +EXPORT_SYMBOL(cpumask_of_node);
> +
> +void numa_clear_node(int cpu)
> +{
> +       set_cpu_numa_node(cpu, NUMA_NO_NODE);
> +}
> +
> +void map_cpu_to_node(int cpu, int nid)
> +{
> +       if (nid < 0) { /* just initialize by zero */
> +               cpu_to_node_map[cpu] = 0;
> +               return;
> +       }
> +
> +       cpu_to_node_map[cpu] = nid;
> +       cpumask_set_cpu(cpu, node_to_cpumask_map[nid]);
> +       set_numa_node(nid);
> +}
> +
> +/**
> + * build_cpu_to_node_map - setup cpu to node and node to cpumask arrays
> + *
> + * Build cpu to node mapping and initialize the per node cpu masks using
> + * info from the node_cpu_hwid array handed to us by ACPI or DT.
> + */
> +void __init build_cpu_to_node_map(void)
> +{
> +       int cpu, i, node;
> +
> +       for (node = 0; node < MAX_NUMNODES; node++)
> +               cpumask_clear(node_to_cpumask_map[node]);
> +
> +       for_each_possible_cpu(cpu) {
> +               node = NUMA_NO_NODE;
> +               for_each_possible_cpu(i) {
> +                       if (cpu_logical_map(cpu) == node_cpu_hwid[i].cpu_hwid) {
> +                               node = node_cpu_hwid[i].node_id;
> +                               break;
> +                       }
> +               }
> +               map_cpu_to_node(cpu, node);
> +       }
> +}
> +/*
> + * Allocate node_to_cpumask_map based on number of available nodes
> + * Requires node_possible_map to be valid.
> + *
> + * Note: cpumask_of_node() is not valid until after this is done.
> + * (Use CONFIG_DEBUG_PER_CPU_MAPS to check this.)
> + */
> +void __init setup_node_to_cpumask_map(void)
> +{
> +       unsigned int node;
> +
> +       /* setup nr_node_ids if not done yet */
> +       if (nr_node_ids == MAX_NUMNODES)
> +               setup_nr_node_ids();
> +
> +       /* allocate the map */
> +       for (node = 0; node < nr_node_ids; node++)
> +               alloc_bootmem_cpumask_var(&node_to_cpumask_map[node]);
> +
> +       /* cpumask_of_node() will now work */
> +       pr_debug("Node to cpumask map for %d nodes\n", nr_node_ids);
> +}
> +
> +/*
> + *  Set the cpu to node and mem mapping
> + */
> +void numa_store_cpu_info(int cpu)
> +{
> +       if (dummy_numa_enabled) {
> +               /* set to default */
> +               node_cpu_hwid[cpu].node_id  =  0;
> +               node_cpu_hwid[cpu].cpu_hwid = cpu_logical_map(cpu);
> +       }
> +       map_cpu_to_node(cpu, node_cpu_hwid[cpu].node_id);
> +}
> +
> +/**
> + * numa_add_memblk_to - Add one numa_memblk to a numa_meminfo
> + */
> +
> +static int __init numa_add_memblk_to(int nid, u64 start, u64 end,
> +                                    struct numa_meminfo *mi)
> +{
> +       /* ignore zero length blks */
> +       if (start == end)
> +               return 0;
> +
> +       /* whine about and ignore invalid blks */
> +       if (start > end || nid < 0 || nid >= MAX_NUMNODES) {
> +               pr_warn("NUMA: Warning: invalid memblk node %d [mem %#010Lx-%#010Lx]\n",
> +                               nid, start, end - 1);
> +               return 0;
> +       }
> +
> +       if (mi->nr_blks >= NR_NODE_MEMBLKS) {
> +               pr_err("NUMA: too many memblk ranges\n");
> +               return -EINVAL;
> +       }
> +
> +       pr_info("NUMA: Adding memblock %d [0x%llx - 0x%llx] on node %d\n",
> +                       mi->nr_blks, start, end, nid);
> +       mi->blk[mi->nr_blks].start = start;
> +       mi->blk[mi->nr_blks].end = end;
> +       mi->blk[mi->nr_blks].nid = nid;
> +       mi->nr_blks++;
> +       return 0;
> +}
> +
> +/**
> + * numa_add_memblk - Add one numa_memblk to numa_meminfo
> + * @nid: NUMA node ID of the new memblk
> + * @start: Start address of the new memblk
> + * @end: End address of the new memblk
> + *
> + * Add a new memblk to the default numa_meminfo.
> + *
> + * RETURNS:
> + * 0 on success, -errno on failure.
> + */
> +#define MAX_PHYS_ADDR  ((phys_addr_t)~0)
> +
> +int __init numa_add_memblk(u32 nid, u64 base, u64 end)
> +{
> +       const u64 phys_offset = __pa(PAGE_OFFSET);
> +
> +       base &= PAGE_MASK;
> +       end &= PAGE_MASK;
> +
> +       if (base > MAX_PHYS_ADDR) {
> +               pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
> +                               base, base + end);
> +               return -ENOMEM;
> +       }
> +
> +       if (base + end > MAX_PHYS_ADDR) {
> +               pr_info("NUMA: Ignoring memory range 0x%lx - 0x%llx\n",
> +                               ULONG_MAX, base + end);
> +               end = MAX_PHYS_ADDR - base;
> +       }
> +
> +       if (base + end < phys_offset) {
> +               pr_warn("NUMA: Ignoring memory block 0x%llx - 0x%llx\n",
> +                          base, base + end);
> +               return -ENOMEM;
> +       }
> +       if (base < phys_offset) {
> +               pr_info("NUMA: Ignoring memory range 0x%llx - 0x%llx\n",
> +                          base, phys_offset);
> +               end -= phys_offset - base;
> +               base = phys_offset;
> +       }
> +
> +       return numa_add_memblk_to(nid, base, base + end, &numa_meminfo);
> +}
> +EXPORT_SYMBOL(numa_add_memblk);
> +
> +/* Initialize NODE_DATA for a node on the local memory */
> +static void __init setup_node_data(int nid, u64 start, u64 end)
> +{
> +       const size_t nd_size = roundup(sizeof(pg_data_t), PAGE_SIZE);
> +       u64 nd_pa;
> +       void *nd;
> +       int tnid;
> +
> +       start = roundup(start, ZONE_ALIGN);
> +
> +       pr_info("Initmem setup node %d [mem %#010Lx-%#010Lx]\n",
> +              nid, start, end - 1);
> +
> +       /*
> +        * Allocate node data.  Try node-local memory and then any node.
> +        */
> +       nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
> +       if (!nd_pa) {
> +               nd_pa = __memblock_alloc_base(nd_size, SMP_CACHE_BYTES,
> +                                             MEMBLOCK_ALLOC_ACCESSIBLE);
> +               if (!nd_pa) {
> +                       pr_err("Cannot find %zu bytes in node %d\n",
> +                              nd_size, nid);
> +                       return;
> +               }
> +       }
> +       nd = __va(nd_pa);
> +
> +       /* report and initialize */
> +       pr_info("  NODE_DATA [mem %#010Lx-%#010Lx]\n",
> +              nd_pa, nd_pa + nd_size - 1);
> +       tnid = early_pfn_to_nid(nd_pa >> PAGE_SHIFT);
> +       if (tnid != nid)
> +               pr_info("    NODE_DATA(%d) on node %d\n", nid, tnid);
> +
> +       node_data[nid] = nd;
> +       memset(NODE_DATA(nid), 0, sizeof(pg_data_t));
> +       NODE_DATA(nid)->node_id = nid;
> +       NODE_DATA(nid)->node_start_pfn = start >> PAGE_SHIFT;
> +       NODE_DATA(nid)->node_spanned_pages = (end - start) >> PAGE_SHIFT;
> +
> +       node_set_online(nid);
> +}
> +
> +/*
> + * Set nodes, which have memory in @mi, in *@nodemask.
> + */
> +static void __init numa_nodemask_from_meminfo(nodemask_t *nodemask,
> +                                             const struct numa_meminfo *mi)
> +{
> +       int i;
> +
> +       for (i = 0; i < ARRAY_SIZE(mi->blk); i++)
> +               if (mi->blk[i].start != mi->blk[i].end &&
> +                   mi->blk[i].nid != NUMA_NO_NODE)
> +                       node_set(mi->blk[i].nid, *nodemask);
> +}
> +
> +/*
> + * Sanity check to catch more bad NUMA configurations (they are amazingly
> + * common).  Make sure the nodes cover all memory.
> + */
> +static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
> +{
> +       u64 numaram, totalram;
> +       int i;
> +
> +       numaram = 0;
> +       for (i = 0; i < mi->nr_blks; i++) {
> +               u64 s = mi->blk[i].start >> PAGE_SHIFT;
> +               u64 e = mi->blk[i].end >> PAGE_SHIFT;
> +
> +               numaram += e - s;
> +               numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
> +               if ((s64)numaram < 0)
> +                       numaram = 0;
> +       }
> +
> +       totalram = max_pfn - absent_pages_in_range(0, max_pfn);
> +
> +       /* We seem to lose 3 pages somewhere. Allow 1M of slack. */
> +       if ((s64)(totalram - numaram) >= (1 << (20 - PAGE_SHIFT))) {
> +               pr_err("NUMA: nodes only cover %lluMB of your %lluMB Total RAM. Not used.\n",
> +                      (numaram << PAGE_SHIFT) >> 20,
> +                      (totalram << PAGE_SHIFT) >> 20);
> +               return false;
> +       }
> +       return true;
> +}
> +
> +/**
> + * numa_reset_distance - Reset NUMA distance table
> + *
> + * The current table is freed.  The next numa_set_distance() call will
> + * create a new one.
> + */
> +void __init numa_reset_distance(void)
> +{
> +       size_t size = numa_distance_cnt * numa_distance_cnt *
> +               sizeof(numa_distance[0]);
> +
> +       /* numa_distance could be 1LU marking allocation failure, test cnt */
> +       if (numa_distance_cnt)
> +               memblock_free(__pa(numa_distance), size);
> +       numa_distance_cnt = 0;
> +       numa_distance = NULL;   /* enable table creation */
> +}
> +
> +static int __init numa_alloc_distance(void)
> +{
> +       nodemask_t nodes_parsed;
> +       size_t size;
> +       int i, j, cnt = 0;
> +       u64 phys;
> +
> +       /* size the new table and allocate it */
> +       nodes_parsed = numa_nodes_parsed;
> +       numa_nodemask_from_meminfo(&nodes_parsed, &numa_meminfo);
> +
> +       for_each_node_mask(i, nodes_parsed)
> +               cnt = i;
> +       cnt++;
> +       size = cnt * cnt * sizeof(numa_distance[0]);
> +
> +       phys = memblock_find_in_range(0, PFN_PHYS(max_pfn),
> +                                     size, PAGE_SIZE);
> +       if (!phys) {
> +               pr_warn("NUMA: Warning: can't allocate distance table!\n");
> +               /* don't retry until explicitly reset */
> +               numa_distance = (void *)1LU;
> +               return -ENOMEM;
> +       }
> +       memblock_reserve(phys, size);
> +
> +       numa_distance = __va(phys);
> +       numa_distance_cnt = cnt;
> +
> +       /* fill with the default distances */
> +       for (i = 0; i < cnt; i++)
> +               for (j = 0; j < cnt; j++)
> +                       numa_distance[i * cnt + j] = i == j ?
> +                               LOCAL_DISTANCE : REMOTE_DISTANCE;
> +       pr_debug("NUMA: Initialized distance table, cnt=%d\n", cnt);
> +
> +       return 0;
> +}
> +
> +/**
> + * numa_set_distance - Set NUMA distance from one NUMA node to another
> + * @from: the 'from' node to set distance
> + * @to: the 'to'  node to set distance
> + * @distance: NUMA distance
> + *
> + * Set the distance from node @from to @to to @distance.  If distance table
> + * doesn't exist, one which is large enough to accommodate all the currently
> + * known nodes will be created.
> + *
> + * If such table cannot be allocated, a warning is printed and further
> + * calls are ignored until the distance table is reset with
> + * numa_reset_distance().
> + *
> + * If @from or @to is higher than the highest known node or lower than zero
> + * at the time of table creation or @distance doesn't make sense, the call
> + * is ignored.
> + * This is to allow simplification of specific NUMA config implementations.
> + */
> +void __init numa_set_distance(int from, int to, int distance)
> +{
> +       if (!numa_distance && numa_alloc_distance() < 0)
> +               return;
> +
> +       if (from >= numa_distance_cnt || to >= numa_distance_cnt ||
> +                       from < 0 || to < 0) {
> +               pr_warn_once("NUMA: Warning: node ids are out of bound, from=%d to=%d distance=%d\n",
> +                           from, to, distance);
> +               return;
> +       }
> +
> +       if ((u8)distance != distance ||
> +           (from == to && distance != LOCAL_DISTANCE)) {
> +               pr_warn_once("NUMA: Warning: invalid distance parameter, from=%d to=%d distance=%d\n",
> +                            from, to, distance);
> +               return;
> +       }
> +
> +       numa_distance[from * numa_distance_cnt + to] = distance;
> +}
> +EXPORT_SYMBOL(numa_set_distance);
> +
> +int __node_distance(int from, int to)
> +{
> +       if (from >= numa_distance_cnt || to >= numa_distance_cnt)
> +               return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
> +       return numa_distance[from * numa_distance_cnt + to];
> +}
> +EXPORT_SYMBOL(__node_distance);
> +
> +static int __init numa_register_memblks(struct numa_meminfo *mi)
> +{
> +       unsigned long uninitialized_var(pfn_align);
> +       int i, nid;
> +
> +       /* Account for nodes with cpus and no memory */
> +       node_possible_map = numa_nodes_parsed;
> +       numa_nodemask_from_meminfo(&node_possible_map, mi);
> +       if (WARN_ON(nodes_empty(node_possible_map)))
> +               return -EINVAL;
> +
> +       for (i = 0; i < mi->nr_blks; i++) {
> +               struct numa_memblk *mb = &mi->blk[i];
> +
> +               memblock_set_node(mb->start, mb->end - mb->start,
> +                                 &memblock.memory, mb->nid);
> +       }
> +
> +       /*
> +        * If sections array is gonna be used for pfn -> nid mapping, check
> +        * whether its granularity is fine enough.
> +        */
> +#ifdef NODE_NOT_IN_PAGE_FLAGS
> +       pfn_align = node_map_pfn_alignment();
> +       if (pfn_align && pfn_align < PAGES_PER_SECTION) {
> +               pr_warn("Node alignment %lluMB < min %lluMB, rejecting NUMA config\n",
> +                      PFN_PHYS(pfn_align) >> 20,
> +                      PFN_PHYS(PAGES_PER_SECTION) >> 20);
> +               return -EINVAL;
> +       }
> +#endif
> +       if (!numa_meminfo_cover_memory(mi))
> +               return -EINVAL;
> +
> +       /* Finally register nodes. */
> +       for_each_node_mask(nid, node_possible_map) {
> +               u64 start = PFN_PHYS(max_pfn);
> +               u64 end = 0;
> +
> +               for (i = 0; i < mi->nr_blks; i++) {
> +                       if (nid != mi->blk[i].nid)
> +                               continue;
> +                       start = min(mi->blk[i].start, start);
> +                       end = max(mi->blk[i].end, end);
> +               }
> +
> +               if (start < end)
> +                       setup_node_data(nid, start, end);
> +       }
> +
> +       /* Dump memblock with node info and return. */
> +       memblock_dump_all();
> +       return 0;
> +}
> +
> +static int __init numa_init(int (*init_func)(void))
> +{
> +       int ret, i;
> +
> +       nodes_clear(numa_nodes_parsed);
> +       nodes_clear(node_possible_map);
> +       nodes_clear(node_online_map);
> +
> +       ret = init_func();
> +       if (ret < 0)
> +               return ret;
> +
> +       ret = numa_register_memblks(&numa_meminfo);
> +       if (ret < 0)
> +               return ret;
> +
> +       for (i = 0; i < nr_cpu_ids; i++)
> +               numa_clear_node(i);
> +
> +       setup_node_to_cpumask_map();
> +       build_cpu_to_node_map();
> +       return 0;
> +}
> +
> +/**
> + * dummy_numa_init - Fallback dummy NUMA init
> + *
> + * Used if there's no underlying NUMA architecture, NUMA initialization
> + * fails, or NUMA is disabled on the command line.
> + *
> + * Must online at least one node and add memory blocks that cover all
> + * allowed memory.  This function must not fail.
> + */
> +static int __init dummy_numa_init(void)
> +{
> +       pr_info("%s\n", "No NUMA configuration found");
> +       pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n",
> +              0LLU, PFN_PHYS(max_pfn) - 1);
> +       node_set(0, numa_nodes_parsed);
> +       numa_add_memblk(0, 0, PFN_PHYS(max_pfn));
> +       dummy_numa_enabled = 1;
> +
> +       return 0;
> +}
> +
> +/**
> + * arm64_numa_init - Initialize NUMA
> + *
> + * Try each configured NUMA initialization method until one succeeds.  The
> + * last fallback is dummy single node config encompassing whole memory and
> + * never fails.
> + */
> +void __init arm64_numa_init(void)
> +{
> +       numa_init(dummy_numa_init);
> +}
> --
> 1.8.1.4
>

thanks
Ganapat
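
For illustration only (this is not part of the posted patch): the
distance handling in numa.c keeps a flat cnt x cnt table of u8 values,
filled with LOCAL_DISTANCE on the diagonal and REMOTE_DISTANCE elsewhere,
and ignores updates that are out of range or do not fit in a u8. A
minimal userspace sketch of that layout follows. The names
struct dist_table, dist_alloc(), dist_set() and dist_get() are
hypothetical, malloc() stands in for the memblock allocation, and the
negative-index check in dist_get() is an addition that the patch's
__node_distance() does not have.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LOCAL_DISTANCE	10	/* values from include/linux/topology.h */
#define REMOTE_DISTANCE	20

/* Hypothetical container for the flat distance matrix. */
struct dist_table {
	int cnt;	/* number of nodes per dimension */
	uint8_t *d;	/* cnt * cnt entries, row-major */
};

/* Allocate the table and fill it with the default distances. */
static int dist_alloc(struct dist_table *t, int cnt)
{
	int i, j;

	assert(cnt > 0);
	t->d = malloc((size_t)cnt * (size_t)cnt);
	if (!t->d)
		return -1;
	t->cnt = cnt;

	for (i = 0; i < cnt; i++)
		for (j = 0; j < cnt; j++)
			t->d[i * cnt + j] = (i == j) ?
				LOCAL_DISTANCE : REMOTE_DISTANCE;
	return 0;
}

/* Store one entry; reject bad node ids and distances that make no sense. */
static int dist_set(struct dist_table *t, int from, int to, int distance)
{
	if (from < 0 || to < 0 || from >= t->cnt || to >= t->cnt)
		return -1;
	if ((uint8_t)distance != distance ||
	    (from == to && distance != LOCAL_DISTANCE))
		return -1;

	t->d[from * t->cnt + to] = (uint8_t)distance;
	return 0;
}

/* Look up one entry; fall back to the defaults for unknown nodes. */
static int dist_get(const struct dist_table *t, int from, int to)
{
	if (from < 0 || to < 0 || from >= t->cnt || to >= t->cnt)
		return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
	return t->d[from * t->cnt + to];
}
```

Each direction is stored separately, so setting the distance from node 0
to node 1 leaves the entry from node 1 to node 0 at its default.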

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 3/4] arm64, numa, dt: adding dt based numa support using dt node property arm, associativity
  2015-08-14 16:39   ` Ganapatrao Kulkarni
@ 2015-10-09 15:18       ` Catalin Marinas
  -1 siblings, 0 replies; 94+ messages in thread
From: Catalin Marinas @ 2015-10-09 15:18 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will.Deacon-5wv7dgnIgG8,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4,
	pawel.moll-5wv7dgnIgG8, mark.rutland-5wv7dgnIgG8,
	ijc+devicetree-KcIKpvwj1kUDXYZnReoRVg,
	galak-sgV2jX0FEOL9JmXXK+q4OQ, gpkulkarni-Re5JQEeQqe8AvxtiuMwx3w

On Fri, Aug 14, 2015 at 10:09:33PM +0530, Ganapatrao Kulkarni wrote:
> Add DT node parsing for NUMA topology using the arm,associativity property.
> The arm,associativity property can be used to map memory, CPUs and
> I/O devices to NUMA nodes.
> 
> Signed-off-by: Ganapatrao Kulkarni <gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
> ---
>  arch/arm64/Kconfig            |   6 +
>  arch/arm64/include/asm/numa.h |   7 +
>  arch/arm64/kernel/Makefile    |   1 +
>  arch/arm64/kernel/dt_numa.c   | 316 ++++++++++++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/smp.c       |   1 +
>  arch/arm64/mm/numa.c          |  13 ++

Since a lot of the code here is very similar to powerpc, any chance of
making it common, especially if we try to use the same bindings (maybe
with a different prefix: "arm," vs "ibm,")? See
https://members.openpowerfoundation.org/document/dl/469.

Also, since this series is from August, any chance of posting another
version that incorporates the feedback provided so far?

-- 
Catalin

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 3/4] arm64, numa, dt: adding dt based numa support using dt node property arm, associativity
  2015-10-09 15:18       ` Catalin Marinas
@ 2015-10-09 16:51           ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-10-09 16:51 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Grant Likely,
	Leif Lindholm, rfranz-YGCgFSpz5w/QT0dZR+AlfA, Ard Biesheuvel,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring, Steve Capper,
	Hanjun Guo, Al Stone, Arnd Bergmann, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala

Hi Catalin,


On Fri, Oct 9, 2015 at 8:48 PM, Catalin Marinas <catalin.marinas-5wv7dgnIgG8@public.gmane.org> wrote:
> On Fri, Aug 14, 2015 at 10:09:33PM +0530, Ganapatrao Kulkarni wrote:
>> Adding dt node parsing for numa topology using property arm,associativity.
>> arm,associativity property can be used to map memory, cpu and
>> io devices to numa node.
>>
>> Signed-off-by: Ganapatrao Kulkarni <gkulkarni-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org>
>> ---
>>  arch/arm64/Kconfig            |   6 +
>>  arch/arm64/include/asm/numa.h |   7 +
>>  arch/arm64/kernel/Makefile    |   1 +
>>  arch/arm64/kernel/dt_numa.c   | 316 ++++++++++++++++++++++++++++++++++++++++++
>>  arch/arm64/kernel/smp.c       |   1 +
>>  arch/arm64/mm/numa.c          |  13 ++
>
> Since a lot of code here is very similar to powerpc, any chance of
> making it common, especially if we try to use the same bindings (maybe
> with a different prefix ("arm," vs "ibm,")? See
> https://members.openpowerfoundation.org/document/dl/469.
There was a discussion on the documentation patch (PATCH 2/4) about
whether to go with an associativity-based dt binding or not:
https://patchwork.ozlabs.org/patch/507536/
It seems this binding may not map, or would be a complex representation,
for numa topologies like mesh and ring, which are used in upcoming
arm64 numa platforms (as Mark Rutland pointed out).
In this regard, I have proposed a new binding, which will be based on
associativity and ACPI.
This binding will serve a wide range of arm64 platforms.
I am waiting for review comments on the proposal; however, I can post
the next version of the patchset with the new proposal implemented.

>
> Also, since this series is from August, any chance to post another and
> incorporate the feedback provided so far?
>
> --
> Catalin

thanks
Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-10-01 11:36                             ` Ganapatrao Kulkarni
@ 2015-10-13 16:47                                 ` Mark Rutland
  -1 siblings, 0 replies; 94+ messages in thread
From: Mark Rutland @ 2015-10-13 16:47 UTC (permalink / raw)
  To: Ganapatrao Kulkarni
  Cc: Prasun.Kapoor-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8, Rob Herring,
	Leizhen (ThunderTown),
	Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd

> > Hi Mark,
> >
> > I am thinking that if we cannot describe these topologies using
> > associativity (or doing so becomes complex), we should think of an
> > alternate binding which suits existing and upcoming arm64 platforms.
> > Can we consider the numa binding below, which is in line with ACPI and
> > will address all sorts of topologies?
> >
> > I am proposing the following:
> >
> > 1. Introduce a "proximity" node property. This property will be
> > present in dt nodes like memory, cpu, bus and devices (like the
> > associativity property) and will tell which numa node (proximity
> > domain) this dt node belongs to.
> >
> > examples:
> >                cpu@000 {
> >                         device_type = "cpu";
> >                         compatible = "cavium,thunder", "arm,armv8";
> >                         reg = <0x0 0x000>;
> >                         enable-method = "psci";
> >                         proximity = <0>;
> >                 };
> >                cpu@001 {
> >                         device_type = "cpu";
> >                         compatible = "cavium,thunder", "arm,armv8";
> >                         reg = <0x0 0x001>;
> >                         enable-method = "psci";
> >                         proximity = <1>;
> >                 };
> >
> >        memory@00000000 {
> >                 device_type = "memory";
> >                 reg = <0x0 0x01400000 0x3 0xFEC00000>;
> >                 proximity =<0>;
> >
> >         };
> >
> >         memory@10000000000 {
> >                 device_type = "memory";
> >                 reg = <0x100 0x00400000 0x3 0xFFC00000>;
> >                 proximity =<1>;
> >         };
> >
> > pcie0@0x8480,00000000 {
> >                 compatible = "cavium,thunder-pcie";
> >                 device_type = "pci";
> >                 msi-parent = <&its>;
> >                 bus-range = <0 255>;
> >                 #size-cells = <2>;
> >                 #address-cells = <3>;
> >                 #stream-id-cells = <1>;
> >                 reg = <0x8480 0x00000000 0 0x10000000>;  /* Configuration space */
> >                 ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>, /* mem ranges */
> >                          <0x03000000 0x8300 0x00000000 0x8300 0x00000000 0x500 0x00000000>;
> >                 proximity = <0>;
> >         };
> >
> >
> > 2. Introduce a new dt node "proximity-map" which will capture the NxN numa
> > node distance matrix.
> >
> > For example, take 4 nodes connected in a mesh/ring structure as:
> > A(0) <connected to> B(1) <connected to> C(2) <connected to> D(3) <connected
> > to> A(0)
> >
> > The relative distances would be:
> >       A -> B = 20
> >       B -> C = 20
> >       C -> D = 20
> >       D -> A = 20
> >       A -> C = 40
> >       B -> D = 40
> >
> > and the dt representation of this distance matrix is:
> >
> >        proximity-map {
> >              node-count = <4>;
> >              distance-matrix = <0 0  10>,
> >                                 <0 1  20>,
> >                                 <0 2  40>,
> >                                 <0 3  20>,
> >                                 <1 0  20>,
> >                                 <1 1  10>,
> >                                 <1 2  20>,
> >                                 <1 3  40>,
> >                                 <2 0  40>,
> >                                 <2 1  20>,
> >                                 <2 2  10>,
> >                                 <2 3  20>,
> >                                 <3 0  20>,
> >                                 <3 1  40>,
> >                                 <3 2  20>,
> >                                 <3 3  10>;
> >           };
> >
> > Entries like <0 0>, <1 1>, <2 2> and <3 3> can be optional, and the code can
> > fill in the default value (local distance).
> > Entries like <1 0> can be optional if <0 1> and <1 0> have the same
> > distance.
> Does this binding look ok?

This looks roughly equivalent to the ACPI SLIT, which means it's as
powerful, which allays my previous concerns.
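
For readers following the thread, here is a minimal sketch (not taken from
the patchset) of how a parser could expand the proposed <from to distance>
triples into a full N x N table, much as the SLIT is laid out. The names
fill_distance_table, struct dist_entry and MAX_NODES are hypothetical, and
the local distance of 10 is simply the value used in the example above.
Missing <n n> entries get the local distance, and a missing <b a> entry is
mirrored from <a b>, as the proposal suggests.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES       4    /* hypothetical limit for this sketch */
#define LOCAL_DISTANCE  10   /* value the example uses for <n n> entries */

/* One <from to distance> triple, as in the proposed distance-matrix. */
struct dist_entry {
	unsigned int from;
	unsigned int to;
	unsigned int distance;
};

/*
 * Expand a possibly sparse list of triples into a full n x n table.
 * Missing <n n> entries default to LOCAL_DISTANCE; a missing <b a>
 * entry is copied from <a b>. Returns 0 on success, -1 on bad input
 * or if a pair of nodes is not described in either direction.
 */
int fill_distance_table(unsigned int n,
			const struct dist_entry *entries, size_t count,
			unsigned int table[MAX_NODES][MAX_NODES])
{
	unsigned int i, j;
	size_t k;

	assert(entries != NULL || count == 0);
	if (n == 0 || n > MAX_NODES)
		return -1;

	/* 0 marks "not specified yet"; valid distances are non-zero. */
	for (i = 0; i < n; i++)
		for (j = 0; j < n; j++)
			table[i][j] = 0;

	/* Copy every explicitly listed triple into the table. */
	for (k = 0; k < count; k++) {
		const struct dist_entry *e = &entries[k];

		if (e->from >= n || e->to >= n || e->distance == 0)
			return -1;
		table[e->from][e->to] = e->distance;
	}

	/* Fill the gaps: local distance on the diagonal, mirror elsewhere. */
	for (i = 0; i < n; i++) {
		for (j = 0; j < n; j++) {
			if (table[i][j])
				continue;
			if (i == j)
				table[i][j] = LOCAL_DISTANCE;
			else if (table[j][i])
				table[i][j] = table[j][i];
			else
				return -1;
		}
	}
	return 0;
}
```

With only the six entries <0 1 20>, <0 2 40>, <0 3 20>, <1 2 20>, <1 3 40>
and <2 3 20>, this yields the same 4x4 matrix as the full 16-entry example.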

> i can implement this and submit in next version of patchset.

Please put together (plaintext) patches.

Then we have a sensible baseline that we can work from; it's somewhat
difficult for others to join the discussion here as-is.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-10-13 16:47                                 ` Mark Rutland
@ 2015-10-13 17:07                                   ` Ganapatrao Kulkarni
  -1 siblings, 0 replies; 94+ messages in thread
From: Ganapatrao Kulkarni @ 2015-10-13 17:07 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Prasun.Kapoor-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8, Rob Herring,
	Leizhen (ThunderTown),
	Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	hanjun.guo-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd

On Tue, Oct 13, 2015 at 10:17 PM, Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org> wrote:
>> > Hi Mark,
>> >
>> > I am thinking that if we cannot describe these topologies using
>> > associativity (or doing so becomes complex), we should think of an
>> > alternate binding which suits existing and upcoming arm64 platforms.
>> > Can we consider the numa binding below, which is in line with ACPI and
>> > will address all sorts of topologies?
>> >
>> > I am proposing the following:
>> >
>> > 1. Introduce a "proximity" node property. This property will be
>> > present in dt nodes like memory, cpu, bus and devices (like the
>> > associativity property) and will tell which numa node (proximity
>> > domain) this dt node belongs to.
>> >
>> > examples:
>> >                cpu@000 {
>> >                         device_type = "cpu";
>> >                         compatible = "cavium,thunder", "arm,armv8";
>> >                         reg = <0x0 0x000>;
>> >                         enable-method = "psci";
>> >                         proximity = <0>;
>> >                 };
>> >                cpu@001 {
>> >                         device_type = "cpu";
>> >                         compatible = "cavium,thunder", "arm,armv8";
>> >                         reg = <0x0 0x001>;
>> >                         enable-method = "psci";
>> >                         proximity = <1>;
>> >                 };
>> >
>> >        memory@00000000 {
>> >                 device_type = "memory";
>> >                 reg = <0x0 0x01400000 0x3 0xFEC00000>;
>> >                 proximity =<0>;
>> >
>> >         };
>> >
>> >         memory@10000000000 {
>> >                 device_type = "memory";
>> >                 reg = <0x100 0x00400000 0x3 0xFFC00000>;
>> >                 proximity =<1>;
>> >         };
>> >
>> > pcie0@0x8480,00000000 {
>> >                 compatible = "cavium,thunder-pcie";
>> >                 device_type = "pci";
>> >                 msi-parent = <&its>;
>> >                 bus-range = <0 255>;
>> >                 #size-cells = <2>;
>> >                 #address-cells = <3>;
>> >                 #stream-id-cells = <1>;
>> >                 reg = <0x8480 0x00000000 0 0x10000000>;  /* Configuration space */
>> >                 ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>, /* mem ranges */
>> >                          <0x03000000 0x8300 0x00000000 0x8300 0x00000000 0x500 0x00000000>;
>> >                 proximity = <0>;
>> >         };
>> >
>> >
>> > 2. Introduce a new dt node "proximity-map" which will capture the NxN numa
>> > node distance matrix.
>> >
>> > For example, take 4 nodes connected in a mesh/ring structure as:
>> > A(0) <connected to> B(1) <connected to> C(2) <connected to> D(3) <connected
>> > to> A(0)
>> >
>> > The relative distances would be:
>> >       A -> B = 20
>> >       B -> C = 20
>> >       C -> D = 20
>> >       D -> A = 20
>> >       A -> C = 40
>> >       B -> D = 40
>> >
>> > and the dt representation of this distance matrix is:
>> >
>> >        proximity-map {
>> >              node-count = <4>;
>> >              distance-matrix = <0 0  10>,
>> >                                 <0 1  20>,
>> >                                 <0 2  40>,
>> >                                 <0 3  20>,
>> >                                 <1 0  20>,
>> >                                 <1 1  10>,
>> >                                 <1 2  20>,
>> >                                 <1 3  40>,
>> >                                 <2 0  40>,
>> >                                 <2 1  20>,
>> >                                 <2 2  10>,
>> >                                 <2 3  20>,
>> >                                 <3 0  20>,
>> >                                 <3 1  40>,
>> >                                 <3 2  20>,
>> >                                 <3 3  10>;
>> >           };
>> >
>> > Entries like <0 0>, <1 1>, <2 2> and <3 3> can be optional, and the code can
>> > fill in the default value (local distance).
>> > Entries like <1 0> can be optional if <0 1> and <1 0> have the same
>> > distance.
>> Does this binding look ok?
>
> This looks roughly equivalent to the ACPI SLIT, which means it's as
> powerful, which allays my previous concerns.
>
>> i can implement this and submit in next version of patchset.
>
> Please put together (plaintext) patches.
>
> Then we have a sensible baseline that we can work from; it's somewhat
> difficult for others to join the discussion here as-is.
Thanks, I will post v6 in a couple of days with an implementation based
on this binding proposal.
>
> Thanks,
> Mark.

thanks
Ganapat

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-10-13 16:47                                 ` Mark Rutland
@ 2015-10-14 13:21                                   ` Hanjun Guo
  -1 siblings, 0 replies; 94+ messages in thread
From: Hanjun Guo @ 2015-10-14 13:21 UTC (permalink / raw)
  To: Mark Rutland, Ganapatrao Kulkarni
  Cc: Prasun.Kapoor-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8, Rob Herring,
	Leizhen (ThunderTown),
	Ganapatrao Kulkarni,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Will Deacon, Catalin Marinas,
	grant.likely-QSEj5FYQhm4dnm+yROfE0A,
	leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	rfranz-YGCgFSpz5w/QT0dZR+AlfA,
	ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A,
	msalter-H+wXaHxf7aLQT0dZR+AlfA, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	steve.capper-QSEj5FYQhm4dnm+yROfE0A,
	al.stone-QSEj5FYQhm4dnm+yROfE0A, arnd-r2nGTMty4D4, Pawel Moll

On 10/14/2015 12:47 AM, Mark Rutland wrote:
>>> Hi Mark,
>>>
>>> I am thinking that if we cannot describe these topologies using
>>> associativity (or doing so becomes complex), we should think of an
>>> alternate binding which suits existing and upcoming arm64 platforms.
>>> Can we consider the numa binding below, which is in line with ACPI and
>>> will address all sorts of topologies?
>>>
>>> I am proposing the following:
>>>
>>> 1. Introduce a "proximity" node property. This property will be
>>> present in dt nodes like memory, cpu, bus and devices (like the
>>> associativity property) and will tell which numa node (proximity
>>> domain) this dt node belongs to.
>>>
>>> examples:
>>>                 cpu@000 {
>>>                          device_type = "cpu";
>>>                          compatible = "cavium,thunder", "arm,armv8";
>>>                          reg = <0x0 0x000>;
>>>                          enable-method = "psci";
>>>                          proximity = <0>;
>>>                  };
>>>                 cpu@001 {
>>>                          device_type = "cpu";
>>>                          compatible = "cavium,thunder", "arm,armv8";
>>>                          reg = <0x0 0x001>;
>>>                          enable-method = "psci";
>>>                          proximity = <1>;
>>>                  };
>>>
>>>         memory@00000000 {
>>>                  device_type = "memory";
>>>                  reg = <0x0 0x01400000 0x3 0xFEC00000>;
>>>                  proximity =<0>;
>>>
>>>          };
>>>
>>>          memory@10000000000 {
>>>                  device_type = "memory";
>>>                  reg = <0x100 0x00400000 0x3 0xFFC00000>;
>>>                  proximity =<1>;
>>>          };
>>>
>>> pcie0@0x8480,00000000 {
>>>                  compatible = "cavium,thunder-pcie";
>>>                  device_type = "pci";
>>>                  msi-parent = <&its>;
>>>                  bus-range = <0 255>;
>>>                  #size-cells = <2>;
>>>                  #address-cells = <3>;
>>>                  #stream-id-cells = <1>;
>>>                  reg = <0x8480 0x00000000 0 0x10000000>;  /* Configuration space */
>>>                  ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>, /* mem ranges */
>>>                           <0x03000000 0x8300 0x00000000 0x8300 0x00000000 0x500 0x00000000>;
>>>                  proximity = <0>;
>>>          };
>>>
>>>
>>> 2. Introduce a new dt node "proximity-map" which will capture the NxN numa
>>> node distance matrix.
>>>
>>> For example, take 4 nodes connected in a mesh/ring structure as:
>>> A(0) <connected to> B(1) <connected to> C(2) <connected to> D(3) <connected
>>> to> A(0)
>>>
>>> The relative distances would be:
>>>        A -> B = 20
>>>        B -> C = 20
>>>        C -> D = 20
>>>        D -> A = 20
>>>        A -> C = 40
>>>        B -> D = 40
>>>
>>> and the dt representation of this distance matrix is:
>>>
>>>         proximity-map {
>>>               node-count = <4>;
>>>               distance-matrix = <0 0  10>,
>>>                                  <0 1  20>,
>>>                                  <0 2  40>,
>>>                                  <0 3  20>,
>>>                                  <1 0  20>,
>>>                                  <1 1  10>,
>>>                                  <1 2  20>,
>>>                                  <1 3  40>,
>>>                                  <2 0  40>,
>>>                                  <2 1  20>,
>>>                                  <2 2  10>,
>>>                                  <2 3  20>,
>>>                                  <3 0  20>,
>>>                                  <3 1  40>,
>>>                                  <3 2  20>,
>>>                                  <3 3  10>;
>>>            };
>>>
>>> Entries like <0 0>, <1 1>, <2 2> and <3 3> can be optional, and the code can
>>> fill in the default value (local distance).
>>> Entries like <1 0> can be optional if <0 1> and <1 0> have the same
>>> distance.
>> Does this binding look ok?
>
> This looks roughly equivalent to the ACPI SLIT, which means it's as
> powerful, which allays my previous concerns.

Cool, I think those bindings are quite extensible and easy to understand.
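
As a cross-check of the quoted 4-node example, the following sketch derives
the same distances from ring hop counts. It assumes, purely from the numbers
in the example, a local distance of 10 and a cost of 20 per hop;
ring_distance and both constants are hypothetical names, and a real platform
would supply its own values in the distance-matrix instead.

```c
#include <assert.h>

#define LOCAL_DISTANCE 10   /* <n n> value in the quoted example */
#define HOP_DISTANCE   20   /* assumed cost per ring hop, inferred from the example */

/* Distance between nodes a and b on a ring of n nodes (requires a, b < n). */
unsigned int ring_distance(unsigned int n, unsigned int a, unsigned int b)
{
	unsigned int forward, hops;

	assert(n > 0 && a < n && b < n);
	if (a == b)
		return LOCAL_DISTANCE;

	/* Take the shorter way around the ring. */
	forward = (a > b) ? a - b : b - a;
	hops = (forward < n - forward) ? forward : n - forward;
	return hops * HOP_DISTANCE;
}
```

For n = 4 this gives 20 for neighbouring nodes (A-B, B-C, C-D, D-A) and 40
for the opposite pairs (A-C, B-D), matching the table in the proposal.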

Thanks
Hanjun

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [PATCH v5 2/4] Documentation: arm64/arm: dt bindings for numa.
@ 2015-10-14 13:21                                   ` Hanjun Guo
  0 siblings, 0 replies; 94+ messages in thread
From: Hanjun Guo @ 2015-10-14 13:21 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/14/2015 12:47 AM, Mark Rutland wrote:
>>> Hi Mark,
>>>
>>> I am thinking that if we cannot describe these topologies using
>>> associativity (or doing so becomes too complex), we should consider an
>>> alternate binding that suits existing and upcoming arm64 platforms.
>>> Can we consider the NUMA binding below, which is in line with ACPI and
>>> should address all sorts of topologies?
>>>
>>> I am proposing the following:
>>>
>>> 1. Introduce a "proximity" node property. This property will be present
>>> in DT nodes such as memory, cpu, bus and devices (like the associativity
>>> property) and will tell which NUMA node (proximity domain) the DT node
>>> belongs to.
>>>
>>> Examples:
>>>         cpu@000 {
>>>                 device_type = "cpu";
>>>                 compatible = "cavium,thunder", "arm,armv8";
>>>                 reg = <0x0 0x000>;
>>>                 enable-method = "psci";
>>>                 proximity = <0>;
>>>         };
>>>         cpu@001 {
>>>                 device_type = "cpu";
>>>                 compatible = "cavium,thunder", "arm,armv8";
>>>                 reg = <0x0 0x001>;
>>>                 enable-method = "psci";
>>>                 proximity = <1>;
>>>         };
>>>
>>>         memory@00000000 {
>>>                 device_type = "memory";
>>>                 reg = <0x0 0x01400000 0x3 0xFEC00000>;
>>>                 proximity = <0>;
>>>         };
>>>
>>>         memory@10000000000 {
>>>                 device_type = "memory";
>>>                 reg = <0x100 0x00400000 0x3 0xFFC00000>;
>>>                 proximity = <1>;
>>>         };
>>>
>>>         pcie0@0x8480,00000000 {
>>>                 compatible = "cavium,thunder-pcie";
>>>                 device_type = "pci";
>>>                 msi-parent = <&its>;
>>>                 bus-range = <0 255>;
>>>                 #size-cells = <2>;
>>>                 #address-cells = <3>;
>>>                 #stream-id-cells = <1>;
>>>                 reg = <0x8480 0x00000000 0 0x10000000>; /* Configuration space */
>>>                 ranges = <0x03000000 0x8010 0x00000000 0x8010 0x00000000 0x70 0x00000000>, /* mem ranges */
>>>                          <0x03000000 0x8300 0x00000000 0x8300 0x00000000 0x500 0x00000000>;
>>>                 proximity = <0>;
>>>         };
>>>
>>>
>>> 2. Introduce new dt node "proximity-map" which will capture the NxN numa
>>> node distance matrix.
>>>
>>> For example, 4 nodes connected in a mesh/ring structure as
>>> A(0) <connected to> B(1) <connected to> C(2) <connected to> D(3)
>>> <connected to> A(0)
>>>
>>> The relative distances would be:
>>>        A -> B = 20
>>>        B -> C = 20
>>>        C -> D = 20
>>>        D -> A = 20
>>>        A -> C = 40
>>>        B -> D = 40
>>>
>>> and the DT representation of this distance matrix is:
>>>
>>>         proximity-map {
>>>               node-count = <4>;
>>>               distance-matrix = <0 0  10>,
>>>                                  <0 1  20>,
>>>                                  <0 2  40>,
>>>                                  <0 3  20>,
>>>                                  <1 0  20>,
>>>                                  <1 1  10>,
>>>                                  <1 2  20>,
>>>                                  <1 3  40>,
>>>                                  <2 0  40>,
>>>                                  <2 1  20>,
>>>                                  <2 2  10>,
>>>                                  <2 3  20>,
>>>                                  <3 0  20>,
>>>                                  <3 1  40>,
>>>                                  <3 2  20>,
>>>                                  <3 3  10>;
>>>         };
>>>
>>> Entries such as <0 0>, <1 1>, <2 2> and <3 3> can be optional; the code
>>> can fill in the default value (local distance).
>>> An entry such as <1 0> can be optional if <0 1> and <1 0> have the same
>>> distance.
>> Does this binding look OK?
>
> This looks roughly equivalent to the ACPI SLIT, which means it's as
> powerful, which allays my previous concerns.

Cool, I think those bindings are quite extensible and easily understood.

Thanks
Hanjun
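
For illustration only, here is a minimal Python sketch of how a parser might
expand the proposed distance-matrix triples into a full NxN table, applying
the two optional-entry rules from the proposal (a missing diagonal entry
defaults to the local distance, and a missing <b a> entry mirrors <a b>).
The function name build_distance_matrix, the LOCAL_DISTANCE constant and the
choice to raise an error on unresolved pairs are assumptions made for this
sketch; none of them come from the patchset or from any kernel code.

```python
# Hypothetical helper, not kernel code: expands (from, to, distance)
# triples as proposed for the "distance-matrix" property.

LOCAL_DISTANCE = 10  # assumed default for a node's distance to itself


def build_distance_matrix(node_count, entries, local=LOCAL_DISTANCE):
    """Return an NxN list of lists built from (src, dst, dist) triples."""
    matrix = [[None] * node_count for _ in range(node_count)]

    # Rule 1: diagonal entries are optional and default to the local distance.
    for i in range(node_count):
        matrix[i][i] = local

    # Explicit entries always win, including explicit diagonal entries.
    for src, dst, dist in entries:
        if not (0 <= src < node_count and 0 <= dst < node_count):
            raise ValueError(f"node id out of range in <{src} {dst} {dist}>")
        matrix[src][dst] = dist

    # Rule 2: a missing <b a> entry is taken from <a b>.
    for i in range(node_count):
        for j in range(node_count):
            if matrix[i][j] is None and matrix[j][i] is not None:
                matrix[i][j] = matrix[j][i]

    # Anything still unset was given in neither direction.
    missing = [(i, j) for i in range(node_count)
               for j in range(node_count) if matrix[i][j] is None]
    if missing:
        raise ValueError(f"no distance given for node pairs {missing}")
    return matrix


# The 4-node ring from the proposal, listing only the six relative distances.
ring = build_distance_matrix(
    4,
    [(0, 1, 20), (1, 2, 20), (2, 3, 20), (3, 0, 20), (0, 2, 40), (1, 3, 40)],
)
for row in ring:
    print(row)
```

With only the six listed distances as input, the result matches the sixteen
entries written out in the proximity-map example, which is why the shorter
form would carry the same information.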


